Teaser



There is a multiplication operation on points that your teachers failed to 
tell you about, either because they didn't know about it or because they 
judged it to be unimportant. But that multiplication turns out to have 
important applications in computer-aided geometric design (CAGD). Among 
other things, it provides the best labels for Bezier control points — better 
even than the labels provided by polar forms (a.k.a. blossoms). 

Let V be a finite-dimensional vector space. Everyone understands that 
it makes sense to multiply covectors, the elements of the dual space $V^* =
\mathrm{Lin}(V, \mathbf{R})$. For example, if x, y, and z are covectors, then the expression
$x^2 - 5yz$ denotes a quadratic form on the space V. Forms have lots of
applications; for example, to put a Euclidean metric on V, we would choose 
a positive definite quadratic form as our measure of squared length. 

But most people don't yet realize that it also makes sense to multiply 
vectors, the elements of V itself. If p, q, and r are vectors, then the expression
$p^2 - 5qr$ denotes an object that is the dual analog of a quadratic form. Let's
call such an object a quadratic site over V. The sites over V of all degrees 
form an algebra, dual to the well-known algebra of forms on V. 

What are sites good for? Consider, say, a cubic Bezier curve segment. It 
is the image, under a cubic function, of a closed interval on the parameter 
line, say the interval [R .. S]. The best labels for the Bezier points of that
cubic segment are the cubic sites $R^3$, $R^2S$, $RS^2$, and $S^3$.
Preface



In computer-aided geometric design (CAGD), a beautiful technology has 
emerged for manipulating algebraic curves and surfaces, associated with the 
names Bernstein, Bezier, de Casteljau, and de Boor. I have spent an
embarrassing fraction of the last fifteen years exploring the roots of that technology,
trying to clarify the mathematics at its core. 

I made some progress in the late 1980s by exploiting functions that I 
christened blossoms. I later learned that those functions already had a name: 
polar forms. I also learned that, in much of my work with polar forms, 
I had been following in de Casteljau's footsteps. But putting aside issues 
of terminology and priority, the key point is that polar forms make things 
clearer. They give us a labeling scheme for Bezier control points in which 
the labels perspicuously encode the geometry. This sparked new discoveries: 
Dahmen, Micchelli, and Seidel used polar forms to construct elegant bases 
for multivariate spline spaces over arbitrary triangulations [11]. 

But I suspected early on that polar forms were not the whole truth in 
this area. To evaluate the polar form of an n-ic, we take n input points and 
combine them, with concatenation, into a sequence of length n. Surely it 
would be better to combine those n points with some flavor of multiplication, 
rather than concatenation; but what flavor? That is, how should we multiply 
two points in this context? For some years, I mistakenly believed that tensors 
would be essential in constructing the proper flavor of multiplication. I wasn't 
far wrong; one way to think of the proper multiplication is as a symmetrized 
variant of the tensor product. But there is a better way to think of it. 

Over the last few years, I finally realized that duality is the key to the 
proper multiplication on points — the duality of finite-dimensional linear 
spaces, where every linear space has an associated dual space and where the 
relationship between primal and dual is a symmetric one. How is duality 
relevant? Well, the dual of a point is a linear form; and we all know how 
to multiply two linear forms, producing a quadratic form. Suppose that we 
multiply two points using that same technique, but on the other side of the 
duality. We produce a quadratic object that is the dual analog of a quadratic 
form. Aha! That is the proper way to multiply two points in this context. 

Site is the name that I propose for the dual analog of a form. So a point 
is a linear site. The product of two linear sites is a quadratic site, just as 
the product of two linear forms is a quadratic form. Indeed, we have two 
whole algebras, each the dual of the other; the algebra of forms is familiar, 
but the algebra of sites has been heretofore unfairly ignored. By recognizing 
and exploiting both algebras, we repair the flawed symmetry between primal 
and dual in CAGD and we finally arrive at an explanation of the Bezier 
technology that feels, to me, like a whole truth. 

Lyle Ramshaw 

lyle.ramshaw@compaq.com

May 1, 2001 



Chapter 1 
Introduction 



This monograph repairs a flaw, quite low in the conventional mathematical 
underpinnings of CAGD. Such flaws show up only rarely; so I want to begin 
by pointing out the flaw, using as little machinery as possible. 

1.1 Multiplying points 

Let A be a finite-dimensional affine space, equipped with a Cartesian coordinate
system. For concreteness, let's focus on the case in which A is an affine
plane and let's refer to the two axes in the Cartesian coordinate system for
A as x and y. So there is a one-to-one correspondence between points in the
plane A and pairs of real numbers. If we think of x and y as functions from
the plane A to the reals, the coordinates of any point P in A are the real
numbers $x(P) = x_P$ and $y(P) = y_P$.

1.1.1 Question 1 

Does it make sense to multiply x and y? People typically answer yes. Given
the real-valued functions $x\colon A \to \mathbf{R}$ and $y\colon A \to \mathbf{R}$, we can multiply them
pointwise to get the function $xy\colon A \to \mathbf{R}$ defined by $xy(P) := x(P)\,y(P)$.

Indeed, objects like the product xy are familiar enough to have acquired 
their own name; they are quadratic forms on the plane A. Each quadratic 
form on A can be written $ax^2 + bxy + cy^2 + dx + ey + f$, for some six real
coefficients a through f.* Recall that a conic section in the plane A is the
zero-set of a quadratic form on A. Even simpler, a line in A is the zero-set
of a linear form on A, which can be written $ax + by + c$.† Just as we can

*Please pardon my temporary sloppiness. More precisely, an n-form is a polynomial
that is homogeneous of degree n, so the true quadratic form here is the homogenized
polynomial $ax^2 + bxy + cy^2 + dxw + eyw + fw^2$. See Section 1.4 and Chapter 4.

†I am being analogously sloppy; the true linear form here is $ax + by + cw$.
multiply the two coordinate functionals x and y, we can multiply any two
linear forms: 

\[
(ax + by + c)(dx + ey + f) = adx^2 + (ae + bd)xy + bey^2 + (af + cd)x + (bf + ce)y + cf
\tag{1.1-1}
\]

A quadratic form that is produced by multiplication in this way is rather 
special, of course; its zero-set is a reducible conic, the union of two lines. 
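
As a quick check of Equation 1.1-1, here is a minimal Python sketch. The function names and the two sample lines are illustrative assumptions, not part of the text. It multiplies the coefficient triples of two linear forms and confirms that a point on either line is a zero of the resulting quadratic form.

```python
def multiply_linear_forms(p, q):
    """Multiply (a x + b y + c)(d x + e y + f).

    Returns the coefficients of x^2, xy, y^2, x, y, 1, in that order,
    following Equation 1.1-1.
    """
    a, b, c = p
    d, e, f = q
    return (a * d, a * e + b * d, b * e, a * f + c * d, b * f + c * e, c * f)


def eval_quadratic(coeffs, x, y):
    """Evaluate A x^2 + B xy + C y^2 + D x + E y + F at the point (x, y)."""
    A, B, C, D, E, F = coeffs
    return A * x * x + B * x * y + C * y * y + D * x + E * y + F


# Two sample lines (hypothetical): x - y = 0 and x + y - 2 = 0.
line1 = (1, -1, 0)
line2 = (1, 1, -2)
conic = multiply_linear_forms(line1, line2)

on_line1 = eval_quadratic(conic, 3, 3)   # (3, 3) lies on the first line
on_line2 = eval_quadratic(conic, 0, 2)   # (0, 2) lies on the second line
off_both = eval_quadratic(conic, 1, 0)   # (1, 0) lies on neither line
```

The product vanishes exactly where one of its factors vanishes, which is why its zero-set is the union of the two lines.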

1.1.2 Question 2 

Given two points P and Q in A, does it make sense to multiply P and Q?
People typically answer no. Of course, there are various special flavors of 
multiplication that arise in special contexts. 

• If P and Q were actually the complex numbers $P = x_P + iy_P$ and
$Q = x_Q + iy_Q$, their complex product would be the complex number
$PQ = (x_P x_Q - y_P y_Q) + i(x_P y_Q + y_P x_Q)$.

• If $P = (x_P, y_P)$ and $Q = (x_Q, y_Q)$ were vectors in a 2-dimensional
inner-product space, then their dot product would be the real number
$P \cdot Q = x_P x_Q + y_P y_Q$.

• If $P = (x_P, y_P, z_P)$ and $Q = (x_Q, y_Q, z_Q)$ were vectors in a 3-dimensional
Euclidean space, then their cross product would be the vector $P \times Q =
(y_P z_Q - z_P y_Q,\; z_P x_Q - x_P z_Q,\; x_P y_Q - y_P x_Q)$.

Outside of such special situations, however, people typically don't assign any 
meaning to the product PQ of two points. 

1.1.3 The Flaw 

If you answered yes to Question 1 and no to Question 2, then the duality 
of linear algebra is broken for you. The points P and Q are elements of a
certain linear space $\hat{A}$ that we discuss in Section 1.4, while the linear forms
$ax + by + c$ and $dx + ey + f$ are elements of the dual space $\hat{A}^*$. If it makes
sense to multiply two linear forms (and it manifestly does), then, by the
symmetry of duality, it must make equal sense to multiply two points.

1.1.4 The Repair 

It does indeed make perfect sense to multiply points. It was a regrettable 
oversight that your teacher failed to explain this to you. Fortunately, we can 
correct that oversight without inventing any new mathematical techniques.
The proper technique to use for multiplying the points P and Q is the same 
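technique that we are already familiar with for multiplying linear forms:

\[
\begin{pmatrix} x_P \;\; y_P \\ 1 \end{pmatrix}
\begin{pmatrix} x_Q \;\; y_Q \\ 1 \end{pmatrix}
=
\begin{pmatrix} x_P x_Q \;\; (x_P y_Q + y_P x_Q) \;\; y_P y_Q \\ (x_P + x_Q) \;\; (y_P + y_Q) \\ 1 \end{pmatrix}
\tag{1.1-2}
\]

This rule for points is dual to the rule for linear forms in Equation 1.1-1. The
plus signs that are missing from this rule are explained in Section 1.4, as are
the extra 1's that appear on the left-hand side, acting like third coordinates
for the points P and Q.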

While we don't need a new mathematical technique to multiply points, 
we do need a new name; let's refer to the dual analog of a form as a site. So 
the object denoted by Equation 1.1-2, the product PQ of the two points P 
and Q, is a quadratic site over the plane A. Points in the plane A are linear 
sites over A. There are constant sites, linear sites, quadratic sites, cubic sites, 
and so forth: a whole algebra of sites over A, dual to the well-known algebra 
of forms on A. 

Note that a quadratic site over the plane A has six coordinates, just as 
a quadratic form on the plane A has six coefficients. Thus, the product of 
two points is not itself a point, nor is it a scalar, nor a vector; rather, it is 
an object of a new type. "Quadratic site" is a name for that new type. 
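
To make the six coordinates concrete, here is a minimal Python sketch. The function name and the sample points are assumptions made for illustration. It computes the coordinates of the product PQ in the order used by the rule spelled out in Section 1.4, namely the coefficients of ξ², ξη, η², ξC, ηC, and C².

```python
def multiply_points(P, Q):
    """Return the six coordinates of the quadratic site PQ.

    P and Q are points (x, y) of the affine plane; each carries an implicit
    third coordinate of 1. The result lists the coefficients of
    xi^2, xi*eta, eta^2, xi*C, eta*C, C^2.
    """
    xp, yp = P
    xq, yq = Q
    return (xp * xq, xp * yq + yp * xq, yp * yq, xp + xq, yp + yq, 1)


P = (2, 3)
Q = (5, 7)
PQ = multiply_points(P, Q)
QP = multiply_points(Q, P)
P_squared = multiply_points(P, P)
```

The result has six entries rather than two or three, and swapping the factors gives the same site, as one would expect of a commutative product.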

1.2 Labeling Bezier control points 

By constructing the algebra of sites in parallel with the algebra of forms, we 
restore symmetry to duality, repairing the flaw pointed out in Section 1.1.3. 
But sites have another important benefit for us in CAGD: They are the key 
to the clearest labeling scheme that I know of for Bezier control points. Since 
we have been talking about multiplying two points in a plane A, let's first 
consider a quadratic Bezier triangle $F(\triangle QRS)$, as shown in Figure 1.1.

Our modeled objects will sit in some affine space; let's refer to that space
as our object space and denote it O. Suppose that the function $F\colon A \to O$
maps the parameter plane A to some surface in O, with each coordinate of
the varying point F(P) being given by a polynomial in the coordinates x(P)
and y(P) of total degree at most 2, that is, being given by a quadratic form
on the plane A. The image F(A) is then a parametric surface in the object
space O, out of which we are cutting a triangular surface patch $F(\triangle QRS)$.
Such a surface patch is called a quadratic Bezier triangle.

Figure 1.1: A quadratic Bezier triangle

The Bezier triangle $F(\triangle QRS)$ has six control points, which are most
clearly labeled

$f(Q^2)$
$f(QR)$ $f(QS)$
$f(R^2)$ $f(RS)$ $f(S^2)$.

In these labels, the arguments to the function f are quadratic sites over the
plane A; for example, $Q^2$ is the square of the point Q. Let $\sigma_2$ denote the
squaring map on A, the map that takes each point P in the plane A to its
square: $\sigma_2(P) := P^2$. (The projective completion of this map $\sigma_2$ is called the
Veronese surface in algebraic geometry.) Any quadratic polynomial surface
can be written in a unique way as an affine transform of the prototypical
surface $\sigma_2$, and the affine map f in our labels is the instancing transformation
involved when we so write the particular surface F. We thus have $F(P) =
f(\sigma_2(P)) = f(P^2)$, for all points P in A.

These labels encode the geometric relationships among the Bezier points
in a way that makes the de Casteljau Algorithm almost obvious. For the
example point T := Q/6 + R/3 + S/2 in the triangle $\triangle QRS$, Figure 1.2
shows how the de Casteljau Algorithm computes the point $F(T) = f(T^2)$
from the six Bezier points of the patch $F(\triangle QRS)$ by doing four 2-dimensional
affine interpolations. Consider the quadratic sites $Q^2$, QR, QS, and QT.
You may not be too sure, as yet, what sites really are. But surely it must
follow from T = Q/6 + R/3 + S/2 that $QT = Q(Q/6 + R/3 + S/2) =
Q^2/6 + QR/3 + QS/2$. Since the map f is affine, we then have $f(QT) =
f(Q^2)/6 + f(QR)/3 + f(QS)/2$, which justifies the uppermost interpolation.
The other interpolations are justified similarly, multiplying the equation T =
Q/6 + R/3 + S/2 by R, by S, and, for the final interpolation, by T.
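
The four interpolations can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the control-point coordinates are made up, the dictionary keys such as 'QR' merely stand in for the site labels, and the function names are hypothetical. The second function expands $T^2 = (Q/6 + R/3 + S/2)^2$ directly, as a cross-check.

```python
from fractions import Fraction as Fr


def affine_comb(weights, points):
    """Affine combination of points; the weights are assumed to sum to 1."""
    dim = len(points[0])
    return tuple(sum(w * p[i] for w, p in zip(weights, points)) for i in range(dim))


def de_casteljau_quadratic_triangle(ctrl, bary):
    """ctrl maps a site label such as 'QR' to the control point f(QR).

    bary = (u, v, w) are the coefficients in T = uQ + vR + wS.
    """
    stage = {}
    for corner in "QRS":
        # Multiply T = uQ + vR + wS by this corner: f(corner * T).
        labels = ["".join(sorted(corner + other)) for other in "QRS"]
        stage[corner] = affine_comb(bary, [ctrl[label] for label in labels])
    # Final interpolation, from multiplying by T: f(T^2).
    return affine_comb(bary, [stage[c] for c in "QRS"])


def expand_square(ctrl, bary):
    """Cross-check: expand T^2 = (uQ + vR + wS)^2 term by term."""
    u, v, w = bary
    weights = {"QQ": u * u, "QR": 2 * u * v, "QS": 2 * u * w,
               "RR": v * v, "RS": 2 * v * w, "SS": w * w}
    labels = sorted(weights)
    return affine_comb([weights[k] for k in labels], [ctrl[k] for k in labels])


# Hypothetical control points in a 3-dimensional object space.
ctrl = {"QQ": (0, 0, 0), "QR": (2, 0, 1), "QS": (0, 2, 1),
        "RR": (4, 0, 0), "RS": (2, 2, 3), "SS": (0, 4, 0)}
bary = (Fr(1, 6), Fr(1, 3), Fr(1, 2))

by_interpolation = de_casteljau_quadratic_triangle(ctrl, bary)
by_expansion = expand_square(ctrl, bary)
```

The first stage performs three interpolations and the final stage one more, for four in all. That the two routes agree is just the statement that multiplication of sites distributes over affine combinations.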
Figure 1.2: Computing a point on a quadratic Bezier triangle

Figure 1.3: Computing a point on a cubic Bezier segment



For a Bezier curve, the parameter plane A is replaced by a parameter
line L, and we get analogous labels by multiplying the points on L. Let the
function $G\colon L \to O$ have the property that each coordinate of the varying
point G(P) is given by, say, a cubic form on L. The image G(L) is then a
cubic curve in the object space O, typically twisted. From that curve, we cut
out the Bezier cubic segment G([R .. S]). Letting T be an example point
on the parameter line L, given as an affine combination of R and S, Figure 1.3
shows the de Casteljau Algorithm computing the point $G(T) = g(T^3)$ from the four Bezier points
$G(R) = g(R^3)$, $g(R^2S)$, $g(RS^2)$, and $G(S) = g(S^3)$ of the cubic segment
G([R .. S]). The arguments to the map g are cubic sites over the line L,
while the affine map g itself is the instancing transformation that realizes
the particular curve G as an affine image of the prototypical cubic curve, the
curve $\kappa_3$ given by $\kappa_3(P) := P^3$ for all points P on L.
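
The curve case can be sketched the same way. In the minimal Python example below, the control points and the parameter value t = 1/3 are hypothetical choices made for illustration, with T = (1 - t)R + tS; the function names are likewise assumptions. Each pass of the loop multiplies the current labels by one more factor of T, so the levels are labeled $R^3, R^2S, RS^2, S^3$, then $R^2T, RST, S^2T$, then $RT^2, ST^2$, and finally $T^3$.

```python
from fractions import Fraction
from math import comb


def de_casteljau_cubic(ctrl, t):
    """ctrl lists g(R^3), g(R^2 S), g(R S^2), g(S^3); T = (1 - t) R + t S."""
    level = [tuple(p) for p in ctrl]
    while len(level) > 1:
        # Replace each adjacent pair by its interpolant at parameter t.
        level = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(level, level[1:])
        ]
    return level[0]


def expand_cube(ctrl, t):
    """Cross-check: expand T^3 = ((1 - t) R + t S)^3 term by term."""
    weights = [comb(3, i) * (1 - t) ** (3 - i) * t ** i for i in range(4)]
    dim = len(ctrl[0])
    return tuple(sum(w * p[k] for w, p in zip(weights, ctrl)) for k in range(dim))


# Hypothetical control points in a 2-dimensional object space.
ctrl = [(0, 0), (1, 2), (3, 2), (4, 0)]
t = Fraction(1, 3)

by_interpolation = de_casteljau_cubic(ctrl, t)
by_expansion = expand_cube(ctrl, t)
```

The binomial weights in the cross-check are the coefficients that appear when the cubic site $T^3$ is expanded in terms of $R^3$, $R^2S$, $RS^2$, and $S^3$.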
Figure 1.4: A point P in the affine plane A

1.3 Adding points 

Let's return to the plane A of Section 1.1, with its Cartesian coordinate 
system (x,y) and its points P and Q. In Equation 1.1-2, we proposed a 
rule for multiplying P and Q, a rule in which each of the points P and Q is 
given three coordinates, rather than two; the point P, for example, has the
coordinates $(x_P, y_P, 1)$. To explain where that third coordinate of 1 comes
from, let's put aside the question of how to multiply points for a moment 
and take up the more elementary question of how to add them. Given two 
points P and Q, does their sum P + Q make sense? The answer is tied up 
with the distinction between linear spaces and affine spaces. 

If the plane A were a linear space (a.k.a. a vector space), with the point
(0, 0) as its origin, then we could add two vectors in A simply by adding their
x and y coordinates separately; we would have $P + Q = (x_P, y_P) + (x_Q, y_Q) =
(x_P + x_Q,\, y_P + y_Q)$. More precisely, letting $\xi := (1, 0)$ and $\eta := (0, 1)$ denote
the unit vectors in the x and y directions, we would have $P = x_P\xi + y_P\eta$ and
$Q = x_Q\xi + y_Q\eta$, and hence $P + Q = (x_P + x_Q)\xi + (y_P + y_Q)\eta$.

But we introduced the plane A as an affine space, and we referred to its 
elements P and Q as points, rather than vectors. Recall that an affine space 
is like a linear space, but without an origin. In an affine space, the midpoint 
P/2 + Q/2 of a line segment PQ is a well-defined point, as is, for any t, the
point $(1 - t)P + tQ$ that lies t of the way from P to Q. But the sum P + Q
of two points is not well-defined. 

To see why not, let C := (0, 0) be the point at the center of our Cartesian
coordinate system for the plane A, as shown in Figure 1.4. Since the plane
A is affine, the center point C is a point like any other; it has no special role.
In particular, we do not have C = 0. For any point P in A, however, we do
have the equation $P - C = x_P\xi + y_P\eta$; that is, the point P differs from the
center point C by the vector $P - C$, and the coordinates $x_P$ and $y_P$ of P
are the coefficients that express that vector $P - C$ as a linear combination
of the unit vectors $\xi$ and $\eta$. So any point P in A can be expressed in the
form $P = x_P\xi + y_P\eta + C$, as a linear combination of $\xi$, $\eta$, and C in which
the coefficient of C is 1. That restriction on the coefficient of C explains the
difficulty that arises when we add two points in an affine space. In the affine
plane A, for example, we have $P = x_P\xi + y_P\eta + C$ and $Q = x_Q\xi + y_Q\eta + C$,
and hence $P + Q = (x_P + x_Q)\xi + (y_P + y_Q)\eta + 2C$. The sum P + Q is not a
point in A because the coefficient of C is not 1.

Figure 1.5: The linearization $\hat{A}$ of the affine plane A (labels: axes x and w; points: w = 1; vectors: w = 0; weighted points)

1.4 Linearization 

The spaces that arise in CAGD are often affine. For example, the parameter 
space of a polynomial Bezier curve or surface is affine. Note that, when we 
used our plane A as the parameter plane of a quadratic Bezier surface in 
Section 1.2, the center point C = (0,0) played no special role. 

But linear spaces have simpler algebraic properties. They are closed under 
addition and scalar multiplication as separate operations. Also, it is linear 
spaces that have associated dual spaces. So it is worth considering whether 
we can somehow convert an affine space into a linear space. 

Fortunately, there is a well-known conversion method, called linearization 
(a.k.a. homogenization). My teachers told me about it, and I hope that your
teachers told you as well. When we linearize an affine space, we embed it
as an affine hyperplane in a linear space of the next larger dimension. For
example, we linearize the affine plane $A = \{x\xi + y\eta + C \mid x, y \in \mathbf{R}\}$ by
extending it into the linear 3-space $\hat{A} = \{x\xi + y\eta + wC \mid x, y, w \in \mathbf{R}\}$, as
shown in Figure 1.5, thereby removing the restriction that the coefficient
of C be 1. An element p of the linearized space $\hat{A}$ has three coordinates
$p = (x_p, y_p, w_p) = x_p\xi + y_p\eta + w_pC$. Such an element p is typically called a
weighted point, where $w_p$ is the weight. (I prefer the term anchor; but let's
save that discussion for later.) A point, such as P, is a weighted point of
weight 1; a vector, such as $\xi$ or $P - C$, is a weighted point of weight 0; and
a sum of two points, such as P + Q, is a weighted point of weight 2.
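
The bookkeeping of weights is easy to see in code. The following minimal Python sketch stores a weighted point as a coordinate triple (x, y, w); the helper names and the sample points are assumptions made for illustration.

```python
from fractions import Fraction


def point(x, y):
    """A point of the affine plane, embedded at weight 1."""
    return (x, y, 1)


def vector(x, y):
    """A vector, embedded at weight 0."""
    return (x, y, 0)


def add(p, q):
    """Coordinatewise sum of two weighted points."""
    return tuple(a + b for a, b in zip(p, q))


def scale(t, p):
    """Scalar multiple of a weighted point."""
    return tuple(t * a for a in p)


def weight(p):
    """The third coordinate, which is the coefficient of C."""
    return p[2]


P = point(2, 3)
Q = point(5, 7)
half = Fraction(1, 2)

total = add(P, Q)                              # weight 2: not a point
difference = add(P, scale(-1, Q))              # weight 0: a vector
midpoint = add(scale(half, P), scale(half, Q)) # weight 1: a point again
```

An affine combination, whose coefficients sum to 1, always lands back at weight 1; that is why midpoints are legal in the affine plane while bare sums are not.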

Linearization enlarges an affine space A of points into a linear space $\hat{A}$
of weighted points, thereby making it legal to add two points and legal to
multiply a point by a scalar. But linearization, by itself, does not make it
legal to multiply two points. To do that, we must enlarge the space $\hat{A}$ even
further: into the algebra $\mathrm{Sym}(\hat{A})$, as we discuss in Section 1.6.

While linearization doesn't take us all the way to the algebra $\mathrm{Sym}(\hat{A})$, it
does clear up some issues that we left dangling in Section 1.1; in particular,
it supplies the linear space to which we apply duality. Recall that we were
discussing how to multiply two points P and Q in the affine plane A. Let $\hat{A}$ be
the linearization of the plane A, which is the linear 3-space of weighted points
shown in Figure 1.5. The dual of $\hat{A}$ is the space $\hat{A}^* = \mathrm{Lin}(\hat{A}, \mathbf{R}) = \mathrm{Aff}(A, \mathbf{R})$
of linear forms on $\hat{A}$. This dual space $\hat{A}^*$ is also 3-dimensional, a typical
element of it being written either as $ax + by + c$, when viewed as belonging
to $\mathrm{Aff}(A, \mathbf{R})$, or as $ax + by + cw$, when viewed as belonging to $\mathrm{Lin}(\hat{A}, \mathbf{R})$.
Whichever way the forms in $\hat{A}^*$ are written, everyone agrees that it makes
sense to multiply them as polynomials, using the obvious rule



\[
(ax + by + cw)(dx + ey + fw) = adx^2 + (ae + bd)xy + bey^2 + (af + cd)xw + (bf + ce)yw + cfw^2.
\]



By duality, it makes equal sense to multiply two weighted points p and q in
$\hat{A}$ as polynomials, using the analogous rule

\[
(x_p\xi + y_p\eta + w_pC)(x_q\xi + y_q\eta + w_qC) = x_p x_q\xi^2 + (x_p y_q + y_p x_q)\xi\eta + y_p y_q\eta^2 + (x_p w_q + w_p x_q)\xi C + (y_p w_q + w_p y_q)\eta C + w_p w_q C^2.
\]



In particular, for two points P and Q in A, with weights $w_P = w_Q = 1$, we
compute the quadratic site PQ that is their product via the rule

\[
(x_P\xi + y_P\eta + C)(x_Q\xi + y_Q\eta + C) = x_P x_Q\xi^2 + (x_P y_Q + y_P x_Q)\xi\eta + y_P y_Q\eta^2 + (x_P + x_Q)\xi C + (y_P + y_Q)\eta C + C^2.
\]
This is precisely the rule for the product of the two points P and Q that we
first saw in Equation 1.1-2; but the missing plus signs have now been filled in,
revealing polynomials in $\xi$, $\eta$, and C, while the extra 1's have been revealed
to be weight coordinates.
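
Since the rule is ordinary polynomial multiplication, it can be sketched with a generic routine. In this minimal Python illustration, a site is stored as a dictionary from exponent triples (powers of ξ, η, C) to coefficients; that representation and the function names are assumptions made for the example.

```python
from collections import defaultdict


def weighted_point(x, y, w):
    """The linear site x*xi + y*eta + w*C, with zero terms dropped."""
    terms = {(1, 0, 0): x, (0, 1, 0): y, (0, 0, 1): w}
    return {exps: c for exps, c in terms.items() if c != 0}


def multiply_sites(s, t):
    """Multiply two sites as polynomials in the symbols xi, eta, and C."""
    product = defaultdict(int)
    for exps_s, coeff_s in s.items():
        for exps_t, coeff_t in t.items():
            exps = tuple(i + j for i, j in zip(exps_s, exps_t))
            product[exps] += coeff_s * coeff_t
    return {exps: c for exps, c in product.items() if c != 0}


P = weighted_point(2, 3, 1)   # the point (2, 3)
Q = weighted_point(5, 7, 1)   # the point (5, 7)
PQ = multiply_sites(P, Q)
PQP = multiply_sites(PQ, P)   # a cubic site, built the same way
```

The same routine multiplies sites of any degree, which is the sense in which the sites form an algebra rather than just a collection of quadratic objects.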



1.5 Multiplying vectors

Linearization solves the problems of addition and scalar multiplication, but
not the problem of multiplication. To make sure that we understand the
multiplication problem on its own, free from any extraneous issues associated
with linearization, let's go through Section 1.1 again, but considering linear
spaces from the outset. That is, instead of multiplying the points in an affine
space, let's consider multiplying the vectors in a linear space.

Let X be a finite-dimensional real linear space, say of dimension k and
with basis $(\xi_1, \ldots, \xi_k)$. So every vector $\zeta$ in X is a linear combination $\zeta =
x_1\xi_1 + \cdots + x_k\xi_k$ of the basis vectors $\xi_1$ through $\xi_k$. Furthermore, the real
coefficients $x_1$ through $x_k$ in that linear combination are uniquely determined
by $\zeta$. Writing them as functions of $\zeta$, we have

\[
\zeta = x_1(\zeta)\xi_1 + \cdots + x_k(\zeta)\xi_k.
\tag{1.5-1}
\]

Each function $x_i\colon X \to \mathbf{R}$ is linear; hence, $x_i$ is a
covector, an element of the dual space $X^* = \mathrm{Lin}(X, \mathbf{R})$. Indeed, the covectors
$(x_1, \ldots, x_k)$ form the basis for $X^*$ that is dual to the basis $(\xi_1, \ldots, \xi_k)$ for X;
that is, we have the duality constraints

\[
x_i(\xi_j) = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{otherwise.} \end{cases}
\]

It follows that any covector z in $X^*$ can be written uniquely as a linear
combination of the basis covectors $x_1$ through $x_k$ as follows:

\[
z = z(\xi_1)x_1 + \cdots + z(\xi_k)x_k.
\tag{1.5-2}
\]

Equation 1.5-2 is dual to Equation 1.5-1. The two would look more
alike if we had written Equation 1.5-2 as $z = \xi_1(z)x_1 + \cdots + \xi_k(z)x_k$. We
chose to write the i-th coefficient as $z(\xi_i)$, rather than as $\xi_i(z)$, because people
typically prefer to think of covectors as functions that take vectors as their
arguments, rather than vice versa. In fact, those two points of view are
equally valid. To avoid choosing between them, we can view both the vector
and the covector, more symmetrically, as arguments to a pairing map, the
bilinear map $\langle\ ,\ \rangle\colon X^* \times X \to \mathbf{R}$ that takes a covector z and a vector $\zeta$ to
the real number $\langle z, \zeta\rangle = z(\zeta) = \zeta(z)$; we discuss this more symmetric point
of view in Section 2.3.

Just as in Section 1.1, we are now faced with two questions:
Question 1: Does it make sense to multiply two covectors, say $x_1$ and $x_2$?
Sure. The product $x_1x_2$ is a quadratic form on the vector space X.
We can think of forms on X either syntactically or semantically.
Syntactically, a form is simply a polynomial in the variables $x_1$ through
$x_k$. Semantically, it is the real-valued function on X that results when
the multiplications in such a polynomial are interpreted as pointwise
multiplication of functions. For example, the quadratic form $x_1x_2$ on
X is interpreted semantically as the function $x_1x_2\colon X \to \mathbf{R}$ given, for
all $\zeta$ in X, by $x_1x_2(\zeta) := x_1(\zeta)\,x_2(\zeta)$.

Question 2: Does it make sense to multiply two vectors, say $\xi_1$ and $\xi_2$? By
duality, the answer must be yes. The product $\xi_1\xi_2$ is a quadratic site
over X. We can think of sites over X either syntactically or semantically.
Syntactically, a site is simply a polynomial in the variables $\xi_1$
through $\xi_k$. Semantically, it is the real-valued function on $X^*$ that
results when the multiplications in such a polynomial are interpreted as
pointwise multiplication of functions. For example, the quadratic site
$\xi_1\xi_2$ is interpreted semantically as the function $\xi_1\xi_2\colon X^* \to \mathbf{R}$ whose
value, at any covector z in $X^*$, is given by $\xi_1\xi_2(z) := \xi_1(z)\,\xi_2(z) =
z(\xi_1)\,z(\xi_2)$.
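
The semantic reading of a site can be sketched directly. In this minimal Python illustration, a covector z is represented by the tuple of its values $z(\xi_1), \ldots, z(\xi_k)$ on the basis vectors, and a site by a dictionary from exponent tuples to coefficients; both representations and the function name are assumptions made for the example.

```python
def evaluate_site(site, z):
    """Evaluate a site, viewed as a polynomial function on the dual space.

    site maps exponent tuples (powers of xi_1, ..., xi_k) to coefficients;
    z lists the numbers z(xi_1), ..., z(xi_k).
    """
    total = 0
    for exps, coeff in site.items():
        term = coeff
        for value, power in zip(z, exps):
            term *= value ** power
        total += term
    return total


# With k = 2: the quadratic site xi_1 * xi_2, and a sample covector z
# with z(xi_1) = 3 and z(xi_2) = 4.
xi1_xi2 = {(1, 1): 1}
z = (3, 4)
value = evaluate_site(xi1_xi2, z)

# Another sample site, xi_1^2 - 5 * xi_1 * xi_2, evaluated at the same z.
other = {(2, 0): 1, (1, 1): -5}
other_value = evaluate_site(other, z)
```

Swapping the roles of the two dictionaries' symbols gives the evaluator for forms on X, which is the symmetry that the two questions are meant to bring out.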

When these theories get applied to practical situations, people tend to be 
more interested in real-valued functions on X than they are in real-valued 
functions on X*, that is, more interested in forms than in sites. Indeed, forms 
are used extensively in many fields; in CAGD, for example, each coordinate 
of a parametric curve or surface is a form on the parameter space. Sites, on 
the other hand, have been used so little that they do not yet have a standard 
name. This monograph argues that sites are important in CAGD because 
they give us the best labels for Bezier points. 

1.6 The paired algebras 

We are going to repair the flaw in the underpinnings of CAGD pointed out in 
Section 1.1.3 as follows: Given any affine space A, we are going to supplement 
the well-known algebra of forms on A with the algebra of sites over A, thus 
producing a dual pair of algebras. Doing this takes three steps. 

1.6.1 Linearization 

The first step is the familiar process of linearization, as we discussed in 
Section 1.4. Linearizing the affine space A extends it into a linear space $\hat{A}$
of the next larger dimension. Once we have produced the linear space $\hat{A}$, we
get its dual space $\hat{A}^*$ automatically, since every linear space has a dual.
1.6.2 Algebrization

The second step is similar in structure; let's call it algebrization.§ Just as
linearization extends an affine space A into a naturally associated linear
space $\hat{A}$, so algebrization extends a linear space X into a naturally associated
commutative algebra. Viewed abstractly, this algebra is called the symmetric
algebra of X and is denoted Sym(X).

Actually, multilinear algebra provides at least four ways to algebrize a 
linear space X, that is, to extend X into a naturally associated algebra of 
some flavor [25, 36]: 

• the tensor algebra $T(X) = \bigotimes X$,

• the symmetric algebra $\mathrm{Sym}(X) = S(X)$,

• the alternating (a.k.a. skew-symmetric, exterior, or Grassmann)
algebra $\mathrm{Alt}(X) = \bigwedge X$,

• and, if a quadratic form on X has been chosen, thereby giving X a 
metric structure, the Clifford algebra Clif(X). 

This monograph uses the symmetric algebra. Luckily, that one is the simplest 
of the four: the only one whose multiplication commutes and the only one 
that can be constructed using just polynomials, with no need for tensors. 

By the way, many of these algebras have important applications in CAGD. 
Starting at the bottom, Clifford algebras have proven helpful in analyzing 
Pythagorean-hodograph (PH) curves — a problem in which the Euclidean 
metric plays a central role [10]. Alternating algebras have long been widely 
used; they give a good naming scheme for the subspaces of a linear space, 
and they underlie calculus on manifolds. Symmetric algebras have an even 
longer history, though they are seldom referred to by name; for example, the
algebra of all forms on an affine space A is the symmetric algebra $\mathrm{Sym}(\hat{A}^*)$.
This monograph argues that we in CAGD should supplement that famous
algebra with its dual, the symmetric algebra $\mathrm{Sym}(\hat{A})$ of sites over A. As for
the tensor algebra, I can't think of any application of the full tensor algebra 
in CAGD; but the multiplication in the tensor algebra is the asymmetric 
tensor product, which is what the phrase "tensor-product surface" refers to. 

The symmetric algebra Sym(X) can be constructed in various ways, as
we discuss in Chapter 5. One simple way uses polynomials: We choose
a basis for the linear space X, say $(\xi_1, \ldots, \xi_k)$, and we then construct the
symmetric algebra Sym(X) as the algebra $\mathbf{R}[\xi_1, \ldots, \xi_k]$ of all polynomials
in the symbols $\xi_1$ through $\xi_k$, treated as variables. But choosing a basis as
part of the construction, in this way, raises the fear that different choices
might lead to algebras that differed in some important way. A more abstract
and basis-independent way to construct the symmetric algebra Sym(X) is as
the algebra $\mathrm{Poly}(X^*, \mathbf{R})$ of all real-valued, polynomial functions on the dual
space $X^*$. A vector in X can be thought of as a linear functional on $X^*$,
so a polynomial whose variables are vectors in X gives rise to a real-valued,
polynomial function on $X^*$.

§The verb should mean "to convert something into an algebra", rather than "to make
something more algebraic"; this argues in favor of "algebrize" or "algebratize", rather than
"algebraicize" or "algebraify". I am not fond of "algebrize"; but I like "algebrization",
and I can't justify forming "algebratize" in the absence of the adjective "algebratic".

As our second step in building the paired algebras, we apply this process
of algebrization, independently, to the linear spaces $\hat{A}$ and $\hat{A}^*$. We get two
algebras with the same structure, one of which is an old friend:

$\mathrm{Sym}(\hat{A}) = \mathrm{Poly}(\hat{A}^*, \mathbf{R})$, the algebra of sites over A;
$\mathrm{Sym}(\hat{A}^*) = \mathrm{Poly}(\hat{A}, \mathbf{R})$, the algebra of forms on A.

1.6.3 Choosing the pairing maps 

Only one step remains, in building the paired algebras; but first, we need to
discuss homogeneity. Given any vector space X, an element of the symmetric
algebra Sym(X) is called homogeneous of degree n when the corresponding
polynomial in $\mathbf{R}[\xi_1, \ldots, \xi_k]$ has every term of total degree precisely n or,
equivalently, when the corresponding real-valued function $f\colon X^* \to \mathbf{R}$ satisfies
$f(tz) = t^n f(z)$, for all covectors z in $X^*$ and all real numbers t. In the
symmetric algebra Sym(X), the elements that are homogeneous of degree
n form a linear subspace, which we denote $\mathrm{Sym}_n(X)$. (Some authors use a
superscript: $\mathrm{Sym}^n(X)$.) In our situation, those forms on A that are homogeneous
of degree n constitute the linear space $\mathrm{Sym}_n(\hat{A}^*)$, while those sites
over A that are homogeneous of degree n constitute $\mathrm{Sym}_n(\hat{A})$. We refer to
the elements of these spaces as n-forms on A and n-sites over A.

So far, we have constructed the algebra of forms $\mathrm{Sym}(\hat{A}^*)$ and the algebra
of sites $\mathrm{Sym}(\hat{A})$ as separate algebras, built from the dual linear spaces $\hat{A}^*$
and $\hat{A}$. Our third step makes those two algebras themselves into a dual pair
by choosing, for each n, a pairing

\[
\langle\ ,\ \rangle\colon \mathrm{Sym}_n(\hat{A}^*) \times \mathrm{Sym}_n(\hat{A}) \to \mathbf{R}
\]

between n-forms and n-sites. Fixing such a pairing lets us represent a linear
functional on n-forms as an n-site, and vice versa. For example, consider
the evaluate-an-n-form-at-P functional, the dual functional $e_P$ in $\mathrm{Sym}_n(\hat{A}^*)^*$
that takes an n-form f in $\mathrm{Sym}_n(\hat{A}^*)$ as its argument and returns the real
number $e_P(f) := f(P)$. With the pairing maps that I recommend, that linear
functional is represented by the n-site $e_P = P^n/n!$. Warning: Some authors
scale their pairing maps differently, so as to eliminate that denominator of
n!. By doing so, they simplify their formulas for evaluation, but complicate
their formulas for differentiation, as we discuss at length in Appendix B.
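
The claim that $P^n/n!$ represents evaluation at P can be checked numerically. The minimal Python sketch below assumes one particular normalization of the pairing on monomials, in which a monomial form pairs with the matching monomial site to give the product of the factorials of its exponents and with every other monomial site to give zero. That normalization is an assumption inferred from the formula $e_P = P^n/n!$, not a definition quoted from the text; the function names and the sample form are likewise illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from math import factorial


def multiply(s, t):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    product = defaultdict(int)
    for es, cs in s.items():
        for et, ct in t.items():
            product[tuple(i + j for i, j in zip(es, et))] += cs * ct
    return dict(product)


def power(p, n):
    """The n-th power of a site, for n >= 1."""
    result = p
    for _ in range(n - 1):
        result = multiply(result, p)
    return result


def pair(form, site):
    """Assumed pairing: matching monomials pair to the product of exponent factorials."""
    total = 0
    for exps, coeff in form.items():
        if exps in site:
            scale = 1
            for e in exps:
                scale *= factorial(e)
            total += coeff * site[exps] * scale
    return total


def evaluate(form, coords):
    """Evaluate a form at a weighted point with coordinates (x, y, w)."""
    total = 0
    for exps, coeff in form.items():
        term = coeff
        for value, e in zip(coords, exps):
            term *= value ** e
        total += term
    return total


# A sample 2-form 3x^2 - 2xy + 4yw + w^2 and the point P = (2, 3), weight 1.
form = {(2, 0, 0): 3, (1, 1, 0): -2, (0, 1, 1): 4, (0, 0, 2): 1}
coords = (2, 3, 1)
P = {(1, 0, 0): 2, (0, 1, 0): 3, (0, 0, 1): 1}

n = 2
via_pairing = Fraction(pair(form, power(P, n)), factorial(n))
via_evaluation = evaluate(form, coords)
```

Under this normalization the division by n! is exactly what makes the two numbers agree; scaling the pairing differently would move that factor elsewhere.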
1.7 Piecewise models with smooth joints

Once we have built the paired algebras of forms and sites, what good are 
they? In a nutshell, they provide a tool for analyzing functions defined 
by polynomials. This tool is particularly effective at constraining two such 
functions so that they agree to a certain order somewhere. And that, in turn, 
is a key problem in spline theory, as we here review. 

Computer-aided geometric design (CAGD) is a field of applied mathe- 
matics that studies ways to model and manipulate smooth, synthetic shapes. 
The standard techniques in CAGD involve breaking a shape up into pieces 
and modeling each piece algebraically. The word spline originally meant a 
flexible strip of wood; but it now refers to a great variety of clever ways to 
arrange that the joints between the pieces end up sufficiently smooth. 

Suppose that O is our object space, the space in which we want our modeled
shapes to sit. In CAGD, we typically model shapes in O either parametrically
or implicitly. To model a shape S in O parametrically, we invent
for ourselves an auxiliary space A, called the parameter space; we choose a
function $\mathcal{F}\colon A \to O$, typically piecewise rational; and we then model S as
the image $S := \mathcal{F}(A)$. To model a shape S in O implicitly, we invent an
auxiliary space B, which might be called the gauge space; we choose a function
$\mathcal{G}\colon O \to B$, typically piecewise polynomial; and we model S as the inverse
image $S := \mathcal{G}^{-1}(0)$ of the origin in B. The ideas in this monograph are
applicable to both parametric and implicit modeling. But parametric models of
shapes are more common in CAGD today, so we shall use parametric models
as our examples. When the parameter space A is 1-dimensional, the resulting
shape $\mathcal{F}(A)$ is a parametric curve; when dim(A) = 2, it is a parametric
surface; when dim(A) = d, it is a parametric d-fold.
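
The two modeling styles can be compared on a familiar shape. The following Python sketch is only an illustration, not anything taken from this monograph: it models the unit circle parametrically, as the image of a rational function on a parameter line, and implicitly, as the inverse image of zero under a polynomial gauge function. The names `circle_parametric` and `circle_implicit` are hypothetical, and a single piece stands in for what would in general be a piecewise function.

```python
from fractions import Fraction

def circle_parametric(t):
    """Rational parametrization of the unit circle; the parameter space is a line."""
    t = Fraction(t)
    d = 1 + t * t                      # common denominator of both coordinates
    return ((1 - t * t) / d, 2 * t / d)

def circle_implicit(point):
    """Polynomial gauge function; the circle is the inverse image of 0."""
    x, y = point
    return x * x + y * y - 1

# Every parametrically generated point should satisfy the implicit equation.
samples = [circle_parametric(t) for t in (-3, -1, 0, Fraction(1, 2), 2)]
on_circle = all(circle_implicit(p) == 0 for p in samples)
print(on_circle)
```

Exact rational arithmetic is used so that the check is an equality test rather than a floating-point tolerance.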

The parametric d-folds used in CAGD are typically either piecewise-rational
or piecewise-polynomial, the latter being a special case of the former.
If $\mathcal{F}\colon A \to O$ is a piecewise-polynomial parametric d-fold, then the spaces A
and O are taken to be affine. For a polynomial piece F of $\mathcal{F}$, say of degree
n, each coordinate of the output point F(P) in O is given by an n-form on
A. The piecewise-rational case is similar, except that we add one additional
n-form, serving as a common denominator. More precisely, the spaces A
and O are taken to be projective and each homogeneous coordinate of the
output point F(P) is given by an n-form on A. In Section 4.6, we mention
how one completes an affine space A into its projective closure by ignoring
scalar multiples in the linearized space A. For the bulk of this monograph,
however, we restrict ourselves to the polynomial case.

Let $\mathcal{F}\colon A \to O$ be a piecewise-polynomial parametric d-fold and let F be
one of its pieces. Since F is given by polynomials, it extends to a polynomial
function $F\colon A \to O$, defined on all of A. For example, Figure 1.6 shows the
graph of a function from R to R that is built up from four pieces, each a
segment of a cubic polynomial. It also shows what happens when the first
two of those segments are extended beyond their endpoints. (The other two
are symmetric.) In this example, each adjacent pair of polynomials agrees to
second order at the joint between them, leading to an overall curve that is
twice continuously differentiable.

Figure 1.6: A cubic spline curve with four segments
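
To make "agrees to second order at the joint" concrete, here is a small Python sketch. It assumes that polynomials are stored as coefficient lists, lowest degree first; the two cubics `left` and `right` are invented for illustration and are not the pieces plotted in Figure 1.6. The helper `agreement_order` is a hypothetical name; it compares successive derivatives at the joint.

```python
def derivative(coeffs):
    """Derivative of a polynomial given as [c0, c1, c2, ...], low degree first."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Horner evaluation of the polynomial at x."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def agreement_order(p, q, x, max_order):
    """Largest k <= max_order such that derivatives 0..k of p and q match at x.
    Returns -1 when even the values differ."""
    order = -1
    for _ in range(max_order + 1):
        if evaluate(p, x) != evaluate(q, x):
            break
        order += 1
        p, q = derivative(p), derivative(q)
    return order

left = [1, 2, 3, 1]                        # 1 + 2x + 3x^2 + x^3
# right = left + 5(x - 1)^3 = left + (-5 + 15x - 15x^2 + 5x^3)
right = [1 - 5, 2 + 15, 3 - 15, 1 + 5]
print(agreement_order(left, right, 1, 3))
```

Adding a multiple of (x - 1)^3 to a cubic changes its third derivative at x = 1 but leaves the value and the first two derivatives alone, which is exactly the kind of joint that yields a twice continuously differentiable curve.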

One of the key problems in spline theory is achieving smooth joints. The 
paired algebras assist in that quest through the following result, which we 
discuss in Sections 6.7 and 7.11: 

For any k in [0 . . n], two n-forms f and g on an affine space A agree
to kth order at a point P in A just when we have $\langle f, s\rangle = \langle g, s\rangle$
for all n-sites s over A that are multiples of $P^{n-k}$. (The angle
brackets here denote the pairing between n-forms and n-sites.)

The full implications of this result are subtle, but we can easily check out the
extreme cases. Letting k := n, the two n-forms f and g agree to nth order
at P just when $\langle f, s\rangle = \langle g, s\rangle$ for all n-sites s over A that are multiples of
$P^0 = 1$, that is, for all n-sites s over A. Thus, f and g agree to nth order at
P just when they coincide. Letting k := 0, the forms f and g agree to 0th
order at P just when $\langle f, s\rangle = \langle g, s\rangle$ for all n-sites s that are multiples of $P^n$,
that is, when $\langle f, P^n\rangle = \langle g, P^n\rangle$. This also makes sense, since we have seen
that $\langle f, P^n\rangle/n! = \langle f, P^n/n!\rangle = e_P(f) = f(P)$, and similarly for g.
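
The identity $\langle f, P^n\rangle/n! = f(P)$ can be checked numerically in the simplest case, a parameter line with homogeneous coordinates (u, w). The Python sketch below rests on assumptions that this section does not spell out: it writes U and W for the basis sites dual to the coordinate forms u and w (hypothetical names), and it assumes that the recommended pairing acts on monomials by $\langle u^a w^b, U^c W^d\rangle = a!\,b!$ when (a, b) = (c, d) and 0 otherwise. Under that assumption, the code pairs a cubic form with the cube of a point and compares the result with direct evaluation.

```python
from fractions import Fraction
from math import comb, factorial

def point_power(p, w, n):
    """The n-site (p*U + w*W)^n, expanded by the binomial theorem.
    A key (a, b) stands for the site monomial U^a W^b."""
    return {(a, n - a): comb(n, a) * p**a * w**(n - a) for a in range(n + 1)}

def pair(form, site):
    """Assumed pairing: <u^a w^b, U^c W^d> = a! b! if (a, b) == (c, d), else 0."""
    return sum(c * site.get(key, 0) * factorial(key[0]) * factorial(key[1])
               for key, c in form.items())

def evaluate(form, p, w):
    """Direct evaluation of the form at the point with coordinates (p, w)."""
    return sum(c * p**a * w**b for (a, b), c in form.items())

n = 3
f = {(3, 0): 2, (2, 1): -1, (0, 3): 5}       # the cubic form 2u^3 - u^2 w + 5w^3
p, w = Fraction(3, 2), 1                      # a point P of unit weight
lhs = Fraction(pair(f, point_power(p, w, n)), factorial(n))
print(lhs == evaluate(f, p, w))
```

The binomial coefficient in the expansion of the point's cube combines with the a! b! from the assumed pairing to give n!, which is why the division by n! is needed to recover f(P).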

1.8 Cubic Bezier triangles 

It turns out that polynomial parametric surfaces in 3-space of total degree at 
most 3 are general enough to motivate much of what we do. In what follows, 
we shall often use that class of surfaces, called cubic Bezier triangles, as a
convenient example. Indeed, we have already used quadratic Bezier triangles
as an example several times, including the one shown in Figure 1.1. But 
quadratics are a bit too special; cubics are more generic. 

Let the object space O be affine 3-space, say with (x, y, z) as a Cartesian 
coordinate system, and suppose that we want to design a cubic polynomial 
parametric surface in O. We invent for ourselves an affine parameter plane A. 
Since we just agreed to use x, y, and z as the coordinates in the object space 
O, let's use u and v from now on as the names of the Cartesian coordinates 
in the parameter plane A. We define the x, y, and z coordinates of the
varying point $F(u,v) = (F_x(u,v), F_y(u,v), F_z(u,v))$ as polynomials $F_x$, $F_y$,
and $F_z$ of total degree at most 3 in the variables u and v. The resulting
function $F\colon A \to O$ is called a cubic polynomial parametric surface. The
piece typically cut out of such a surface is a cubic Bezier triangle, the image
$F(\triangle QRS)$ of a triangle $\triangle QRS$ in the plane A. The analog, for arbitrary
degree n and parametric dimension d, is an n-ic polynomial parametric d-fold,
out of which we cut an n-ic Bezier d-simplex.

1.9 Related work 

We now discuss how the paired algebras relate to other work in CAGD, using
a cubic Bezier triangle $F(\triangle QRS)$ as our example.

1.9.1 Bernstein bases and Bezier points 

Bernstein bases and Bezier control points provide the common foundation
for much of CAGD. Any point P in the parameter plane A can be written
uniquely as a barycentric combination of the three points Q, R, and S; that
is, we have P = q(P)Q + r(P)R + s(P)S, where q, r, and s are affine,
real-valued functions on A and where q(P) + r(P) + s(P) = 1, for all points
P in A. It then transpires that every cubic polynomial parametric surface
$F\colon A \to O$ can be written uniquely in the form

\[ F(P) = \sum_{i+j+k=3} \binom{3}{i\,j\,k}\, q(P)^i\, r(P)^j\, s(P)^k\, F_{i,j,k} \tag{1.9-1} \]

for some ten control points $(F_{i,j,k})_{i+j+k=3}$ in O. The factor of $\binom{3}{i\,j\,k}$ in the
summand is the trinomial coefficient given, for i + j + k = n, by
$\binom{n}{i\,j\,k} = n!/(i!\,j!\,k!)$. The ten control points $(F_{i,j,k})$ are the
Bezier points of the Bezier triangle $F(\triangle QRS)$, while the ten corresponding
coefficient functions $B_{i,j,k}\colon A \to \mathbf{R}$ given by

\[ B_{i,j,k}(P) := \binom{3}{i\,j\,k}\, q(P)^i\, r(P)^j\, s(P)^k \]

constitute the Bernstein basis for the cubic polynomial functions on the plane
A with the reference triangle $\triangle QRS$.
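
The Bernstein form of a cubic Bezier triangle is easy to evaluate directly. The Python sketch below is an illustration under stated assumptions: points of the parameter plane are coordinate pairs, barycentric coordinates are computed from signed areas, and the control net `control` is a hypothetical one, chosen so that the surface reproduces the parameter point itself. All function names are invented for this example.

```python
from fractions import Fraction
from math import factorial

def trinomial(i, j, k):
    return factorial(i + j + k) // (factorial(i) * factorial(j) * factorial(k))

def barycentric(P, Q, R, S):
    """Barycentric coordinates (q, r, s) of P with respect to triangle QRS."""
    def area2(A, B, C):                    # twice the signed area of triangle ABC
        return (B[0] - A[0]) * (C[1] - A[1]) - (C[0] - A[0]) * (B[1] - A[1])
    total = Fraction(area2(Q, R, S))
    return (area2(P, R, S) / total, area2(Q, P, S) / total, area2(Q, R, P) / total)

def bernstein(i, j, k, q, r, s):
    return trinomial(i, j, k) * q**i * r**j * s**k

def bezier_triangle(control, q, r, s):
    """Weighted sum of control points; control maps (i, j, k) to a coordinate tuple."""
    dim = len(next(iter(control.values())))
    return tuple(sum(bernstein(i, j, k, q, r, s) * pt[c]
                     for (i, j, k), pt in control.items()) for c in range(dim))

indices = [(i, j, 3 - i - j) for i in range(4) for j in range(4 - i)]
Q, R, S = (0, 0), (1, 0), (0, 1)
P = (Fraction(1, 4), Fraction(1, 2))
q, r, s = barycentric(P, Q, R, S)
# Hypothetical flat control net: the point (iQ + jR + kS)/3 for each index.
control = {(i, j, k): (Fraction(j, 3), Fraction(k, 3)) for i, j, k in indices}
print(len(indices), sum(bernstein(i, j, k, q, r, s) for i, j, k in indices))
print(bezier_triangle(control, q, r, s) == P)
```

There are ten index triples, and the ten basis functions sum to (q + r + s)^3 = 1 at every point, so each F(P) is an affine combination of the control points.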

Things are much the same for any degree n and parametric dimension d.
Given a reference d-simplex for a d-dimensional parameter space A, we get a
Bernstein basis for the n-ic polynomial functions on A, and the coefficients of
the functionals in that Bernstein basis are the Bezier points of the resulting
n-ic Bezier d-simplex.

Assembling a spline d-fold out of pieces cut from polynomial d-folds is
quite a subtle problem, once d exceeds 1. In the case of d = 1, however,
de Casteljau, de Boor, and others built B-splines, a thoroughly satisfactory
theory of spline parametric curves. Indeed, this theory of spline curves is
so attractive that it is tempting to construct spline surfaces as curves of
curves. The resulting surfaces are built from functions $F\colon A \to O$ whose
defining polynomials obey separate degree bounds in u and in v; we discuss
these tensor-product surfaces in Section 6.8. Because B-splines are such an
effective way of dealing with spline curves, tensor-product spline surfaces
have become the most popular surfaces in CAGD.

1.9.2 Bezier points as polar values 

In de Casteljau's development of the theory of B-splines [14], he made good 
use of the classical notion of polar forms, referring to his B-spline control 
points as poles. I popularized his ideas under the name blossoming [42, 43]. 
The polar form, a.k.a. blossom, of a cubic Bezier triangle $F\colon A \to O$ is the
unique symmetric, triaffine function $\tilde{F}\colon A^3 \to O$ that agrees with F on the
diagonal, that is, that satisfies $\tilde{F}(P, P, P) = F(P)$, for all points P in A.

Polar forms are valuable in this context because they give us perspicuous
names for many of the points that are associated with the surface F, but that
don't lie on that surface itself. In particular, the ten Bezier control points
$(F_{i,j,k})_{i+j+k=3}$ of the triangular patch $F(\triangle QRS)$ are the following values of
its polar form $\tilde{F}$:

    $\tilde{F}(Q,Q,Q)$
    $\tilde{F}(Q,Q,R)$   $\tilde{F}(Q,Q,S)$
    $\tilde{F}(Q,R,R)$   $\tilde{F}(Q,R,S)$   $\tilde{F}(Q,S,S)$
    $\tilde{F}(R,R,R)$   $\tilde{F}(R,R,S)$   $\tilde{F}(R,S,S)$   $\tilde{F}(S,S,S)$

Furthermore, the symmetric, multiaffine nature of the polar form $\tilde{F}$ neatly
encodes the geometry that underlies the essential algorithms, such as the
de Casteljau Algorithm for subdivision.
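
The link between the blossom and the de Casteljau Algorithm can be sketched in a few lines of Python. This is an illustration under assumptions, not a definitive implementation: each argument of the blossom is given by its barycentric coordinates with respect to the reference triangle, one de Casteljau step is applied per argument, and the control net `net` is a hypothetical one invented for the example.

```python
from fractions import Fraction
from math import factorial

def de_casteljau_step(net, bary):
    """One de Casteljau step: a degree-n control net becomes a degree-(n-1) net."""
    q, r, s = bary
    n = sum(next(iter(net)))               # every index triple sums to the degree
    lower = {}
    for i in range(n):
        for j in range(n - i):
            k = n - 1 - i - j
            lower[(i, j, k)] = tuple(
                q * a + r * b + s * c
                for a, b, c in zip(net[(i + 1, j, k)],
                                   net[(i, j + 1, k)],
                                   net[(i, j, k + 1)]))
    return lower

def blossom(net, *barys):
    """Blossom value; the number of arguments must equal the degree of the net."""
    for bary in barys:
        net = de_casteljau_step(net, bary)
    return net[(0, 0, 0)]

indices = [(i, j, 3 - i - j) for i in range(4) for j in range(4 - i)]
# Hypothetical control points in 3-space, chosen only for illustration.
net = {(i, j, k): (Fraction(i), Fraction(j * j), Fraction(i * k)) for i, j, k in indices}
Qb, Rb, Sb = (1, 0, 0), (0, 1, 0), (0, 0, 1)      # barycentric coordinates of Q, R, S
a = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 4))
b = (Fraction(1, 3), Fraction(1, 3), Fraction(1, 3))
c = (Fraction(2), Fraction(-1), Fraction(0))

labels_match = (blossom(net, Qb, Qb, Rb) == net[(2, 1, 0)]
                and blossom(net, Qb, Rb, Sb) == net[(1, 1, 1)])
symmetric = blossom(net, a, b, c) == blossom(net, c, a, b)
bernstein_value = tuple(
    sum(factorial(3) // (factorial(i) * factorial(j) * factorial(k))
        * a[0]**i * a[1]**j * a[2]**k * net[(i, j, k)][m] for i, j, k in indices)
    for m in range(3))
diagonal_ok = blossom(net, a, a, a) == bernstein_value
print(labels_match, symmetric, diagonal_ok)
```

Feeding in the vertices Q, R, and S returns the control points themselves, reordering the arguments does not change the value, and repeating one point three times gives the point on the surface.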

Polar forms have proven useful to researchers in spline theory, as well as 
to teachers of it. By using polar forms, Dahmen, Micchelli, and Seidel [11] 
constructed elegant bases for multivariate spline spaces over arbitrary trian- 
gulations, thereby taking an important step toward generalizing the theory 
of B-splines from curves to surfaces. 

By the way, the name "polar form" is good, because it points out the 
connection with the other places in mathematics where the technique of 
polarization is exploited. But the word "form" is used quite heavily already 
in this area of mathematics, most notably for the objects — quadratic forms, 
cubic forms, and the like — that make up the algebra of forms. In this 
monograph, simply to reduce our overloading of the word "form", let's refer
to $\tilde{F}$ as the blossom of F.

1.9.3 From polarization to the paired algebras 

Even in my early work on blossoming, I suspected that the n arguments to 
the blossom should be combined using some flavor of multiplication, rather 
than simply being concatenated into a sequence. But I wrongly believed 
that the symmetrized variant of the tensor-product construction would be 
an essential tool in defining the proper way to multiply points. 

Ron Goldman's pioneering work on dual bases [23, 38] pushed me to 
think harder about duality, since I found it disturbing when he referred to 
two different bases for the same linear space as dual. I then realized that I 
had been mistaken: You don't need tensors to multiply points. Points are 
dual to linear forms, and you certainly don't need tensors to multiply forms. 
Rather, forms are essentially polynomials, and you multiply them as you 
would polynomials. This monograph argues that points are also essentially 
polynomials — to wit, linear sites. And the proper way to multiply sites is 
as you would multiply polynomials. 

The ability to multiply points together to form sites provides a framework
for the Bernstein/Bezier theory that is clearer and more convenient
than blossoming. For example, the function that maps each point P in the
plane A to the cubic site $P^3$ over A becomes a prototype for all possible
cubic parametric surfaces. In algebraic geometry, that prototype is called
the Veronese surface of parametric degree 3 [29]. That Veronese surface sits
in a space of fairly high dimension; in fact, in a 9-space. But every cubic
surface $F\colon A \to O$ in the 3-space O is simply an affine transform of that
prototype; that is, we have $F(P) = f(P^3)$, for all points P in A, where f is an
affine transformation from the 9-dimensional space of unit-weight
3-sites over A, which sits inside $\mathrm{Sym}_3(A)$, to the 3-space O.

Exploiting sites and the affine transformation f, we get the following
simple formulas for the ten Bezier points of the cubic patch $F(\triangle QRS)$:

    $f(Q^3)$
    $f(Q^2R)$   $f(Q^2S)$
    $f(QR^2)$   $f(QRS)$   $f(QS^2)$
    $f(R^3)$   $f(R^2S)$   $f(RS^2)$   $f(S^3)$

Comparing this notation for the Bezier points to our previous two notations, 
we have 

\[ F_{i,j,k} = \tilde{F}(\underbrace{Q,\ldots,Q}_{i},\ \underbrace{R,\ldots,R}_{j},\ \underbrace{S,\ldots,S}_{k}) = f(Q^i R^j S^k) \]

whenever i + j + k = 3. The right-hand, site-based notation preserves all of 
the symmetric, multiaffine strengths of the middle, blossom-based notation, 
while restoring the brevity of the left-hand notation, in which the Bezier 
points are simply numbered. 

The notation and concepts of the paired algebras are more powerful than 
their predecessors, as well as more concise. As an example of this power, 
consider Equation 1.9-1, the basic formula that expresses a point F(P) on a 
cubic Bezier triangle as an affine combination of the ten Bezier points. Using 
sites, we can prove that formula with elementary algebra: 

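\[
\begin{aligned}
F(P) = f(P^3) &= f\bigl((q(P)Q + r(P)R + s(P)S)^3\bigr) \\
&= f\Bigl(\sum_{i+j+k=3} \binom{3}{i\,j\,k}\, q(P)^i\, r(P)^j\, s(P)^k\, Q^i R^j S^k\Bigr) \\
&= \sum_{i+j+k=3} \binom{3}{i\,j\,k}\, q(P)^i\, r(P)^j\, s(P)^k\, f(Q^i R^j S^k).
\end{aligned}
\]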
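
The middle step of that derivation is an ordinary trinomial expansion, and it can be replayed mechanically. In the Python sketch below, a site is represented as a polynomial in three commuting symbols Q, R, and S, stored as a dictionary from exponent triples to coefficients; this representation and the sample barycentric coordinates are assumptions made for illustration. The last step of the derivation relies on f being affine, which is legitimate because the ten coefficients sum to (q + r + s)^3 = 1.

```python
from fractions import Fraction
from math import factorial

def multiply(x, y):
    """Product of two polynomials in the commuting symbols Q, R, S,
    stored as {(i, j, k): coefficient} for the monomial Q^i R^j S^k."""
    product = {}
    for (a, b, c), u in x.items():
        for (d, e, f), v in y.items():
            key = (a + d, b + e, c + f)
            product[key] = product.get(key, 0) + u * v
    return product

q, r, s = Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)    # they sum to 1
P = {(1, 0, 0): q, (0, 1, 0): r, (0, 0, 1): s}             # P = qQ + rR + sS
P_cubed = multiply(multiply(P, P), P)

def trinomial_term(i, j, k):
    coefficient = factorial(3) // (factorial(i) * factorial(j) * factorial(k))
    return coefficient * q**i * r**j * s**k

matches = all(coeff == trinomial_term(*key) for key, coeff in P_cubed.items())
print(len(P_cubed), matches, sum(P_cubed.values()))
```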

1.9.4 Vegter exploits the contraction operators 

The basic operators that interconnect the algebra of forms with the algebra 
of sites are the pairing maps; for each n, we can pair an n-form with an n-site 
to produce a real number. But there is also a richer family of interconnecting 
operators that can be defined from the pairing maps: the contraction operators
[22, 25], which we discuss in Section 7.8. For any k in [0 . . n], we can
contract an n-form on a k-site to produce an (n - k)-form. Symmetrically,
we can contract an n-site on a k-form to produce an (n - k)-site. Pairing
is the special case k = n of either of these flavors of contraction, since both
0-forms and 0-sites are simply real numbers.

Recently, Gert Vegter has been applying these contraction operators to
problems in CAGD [47]; at least, that is how I would describe what he has
been doing. He describes his work as applying the apolar bilinear form, an
inner product on spaces of homogeneous multivariate polynomials that was
used in 19th-century invariant theory. I hope that Vegter will come to view
the paired algebras as providing a cleaner foundation and a simpler notation
for his fine work. Meanwhile, I view his work as encouraging evidence that
the paired algebras will be broadly useful in CAGD, above and beyond giving
us the clearest names for Bezier points.

Warning: I regret to report that Vegter uses the family of pairing maps 
that I first used, as opposed to the family that I now recommend. In the 
language of Appendix B, he uses the averaged pairing, while I currently 
recommend the summed pairing. The field of CAGD will avoid a lot of
confusion if a clear winner emerges soon on this annoying question of where
to put the factor of n!.

1.10 The four frameworks 

The bulk of this monograph analyzes four different frameworks that can be 
used when studying problems in CAGD — for example, when devising new 
spline methods. Each of those frameworks gives names to the relevant linear 
spaces, stipulates various relationships between those spaces, and provides 
certain operators that interconnect those spaces. 

The nested-spaces framework: After some mathematical preliminar- 
ies in Chapter 2, Chapter 3 discusses the nested-spaces framework, shown 
schematically on page 29. This is the naive framework that people often 
adopt when they first start working in CAGD. Such people typically view
a quadratic polynomial, say in the variables u and v, as being a degenerate
case of a cubic polynomial, degenerate in the sense that the coefficients of
the $u^3$, $u^2v$, $uv^2$, and $v^3$ terms all happen to be zero. Thus, in this
framework, the 6-dimensional linear space of all quadratic polynomials in u and v
is viewed as a subspace of the 10-dimensional linear space of all cubics in u
and v. That is the sense in which the spaces in this framework are nested.

The homogenized framework: Linearization leads to the homogenized
framework, discussed in Chapter 4 and shown on page 41. This is the framework
commonly used by researchers in CAGD today. They homogenize their
polynomials; for example, rather than dealing with $u^2 - 3uv + 7v$ either as
a quadratic polynomial in u and v or as a degenerate cubic in u and v, they
instead deal either with the quadratic form $u^2 - 3uv + 7vw$ or with the cubic
form $u^2w - 3uvw + 7vw^2$, where the weight variable w lets them express the
point (u,v) in homogeneous coordinates as [u : v : w]. The resulting forms
make up an algebra, the symmetric algebra Sym(A*) of forms on A.
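
Homogenization is a purely mechanical operation on exponents, as the following Python sketch suggests. It assumes that a polynomial in u and v is stored as a dictionary from exponent pairs to coefficients; the function name `homogenize` is hypothetical. Each term is padded with the power of the weight variable w needed to reach the chosen total degree.

```python
def homogenize(poly, degree):
    """Homogenize a polynomial in u, v to the given total degree.
    poly maps (i, j) to the coefficient of u^i v^j; the result maps
    (i, j, k) to the coefficient of u^i v^j w^k, with i + j + k == degree."""
    if any(i + j > degree for (i, j) in poly):
        raise ValueError("degree too small for this polynomial")
    return {(i, j, degree - i - j): c for (i, j), c in poly.items()}

p = {(2, 0): 1, (1, 1): -3, (0, 1): 7}      # u^2 - 3uv + 7v
quadratic_form = homogenize(p, 2)            # u^2 - 3uv + 7vw
cubic_form = homogenize(p, 3)                # u^2 w - 3uvw + 7vw^2
print(quadratic_form)
print(cubic_form)
```

Setting w = 1 in either form recovers the original polynomial, which is why the same function on A has one representative in each degree that is large enough.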
The separate-algebras framework: In the separate-algebras framework,
discussed in Chapter 5 and shown on page 50, the algebra Sym(A*) of forms
on A is joined by its dual, the algebra Sym(A) of sites over A. But these two
algebras remain separate, in the sense that we have not yet chosen a family
of pairing maps; so we can't combine an n-form on A with an n-site over A to
produce a real number. Even without such pairing maps, a framework that
embraces the algebra of sites along with the algebra of forms has significant
advantages. Chapter 6 attacks various questions in CAGD by analyzing
the Veronese prototypes, the images of the perfect-power maps $P \mapsto P^n$, as
geometric objects sitting inside the algebra of sites.

The paired-algebras framework: By choosing, for each n, a pairing
between n-forms and n-sites, we arrive at our final goal: the paired-algebras
framework, discussed in Chapter 7 and shown on page 82. The pairing maps
and the contraction maps defined from them give us simple formulas for
the evaluation and differentiation of n-forms. Chapter 8 reviews some basic
concepts of CAGD in the light of the paired algebras, showing, among other
things, that the dual of a Bernstein basis for the linear space $\mathrm{Sym}_n(A^*)$ of
n-forms is a Bezier basis for the linear space $\mathrm{Sym}_n(A)$ of n-sites.

Unfortunately, a question of convention raises its ugly head in defining the 
pairing maps: Do we divide by n! or not? Each choice makes some formulas 
pretty, at the price of cluttering up others. I recommend not dividing by n!,
which makes differentiation pretty at the price of cluttering up evaluation. 
Appendix B discusses the tradeoffs at length. 

Universal mapping conditions: By this point in your reading, I hope 
to have convinced you that sites are important and useful in CAGD. But 
you may still feel uneasy about what sites really are — that is, what it 
really means to multiply points. One way to address that uneasiness is to 
apply duality strictly: If you are happy thinking of forms on A as real-valued
functions on A of a certain type, then you should also be happy
thinking of sites over A as real-valued functions on A* of the analogous type.
But universal mapping conditions provide a truer and deeper answer to the 
question of what sites really are. In Chapter 9, we discuss how to construct 
symmetric algebras, as well as tensor algebras, alternating algebras, and 
Clifford algebras, by means of universal mapping conditions. 

If universal mapping conditions don't scare you off by being too abstract 
and formal, you might consider the next step toward formalized abstraction, 
which is category theory. Viewed from the perspective of category theory, 
both the linearization of affine spaces and the algebrization of linear spaces 
are left adjoints of forgetful functors. Appendix A discusses the mathematics 
that underlies this monograph from that still more abstract perspective. 



Chapter 2 

Mathematical Preliminaries 



2.1 On the words "affine" and "linear" 

The word "linear" is used inconsistently in mathematics: It sometimes implies
homogeneity and sometimes doesn't. For example, the polynomial
f(x) := ax + b is called linear even when b is nonzero; but we must have
b = 0 in order for the function $f\colon \mathbf{R} \to \mathbf{R}$ defined by f(x) := ax + b to
qualify as a linear map. We here adopt the convention that linear always
implies homogeneous; when we mean "of degree 1, but not necessarily
homogeneous", we use the term affine. For example, we say affine interpolation,
where most people would say "linear interpolation".

We use the term linear space for the mathematical structure that is often 
called a vector space. While some linear spaces do indeed have vectors as 
their elements, many linear spaces have elements of other types: covectors, 
polynomials, or functions, for example. 

An affine space is like a linear space, but without an origin. If $P_1$ through
$P_m$ are points in an affine space A, the linear combination $t_1P_1 + \cdots + t_mP_m$
denotes a point in A only when $t_1 + \cdots + t_m = 1$. Linear combinations
whose coefficients sum to 1 in this way are called affine combinations. If A
and B are affine spaces, a map $f\colon A \to B$ is affine when it preserves affine
combinations, that is, when $t_1 + \cdots + t_m = 1$ implies
$f(t_1P_1 + \cdots + t_mP_m) = t_1f(P_1) + \cdots + t_mf(P_m)$.
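
A one-dimensional example shows the difference between affine and linear. The Python sketch below uses the map f(x) = 3x + 2, chosen arbitrarily for illustration, together with a hypothetical helper `affine_combination` that refuses weights not summing to 1. The map preserves affine combinations even though it fails to preserve sums.

```python
from fractions import Fraction

def affine_combination(weights, points):
    """Weighted sum of points; legal only when the weights sum to 1."""
    if sum(weights) != 1:
        raise ValueError("weights of an affine combination must sum to 1")
    return sum(t * p for t, p in zip(weights, points))

def f(x):
    """An affine map that is not linear, since its constant term is nonzero."""
    return 3 * x + 2

weights = [Fraction(1, 2), Fraction(3, 2), Fraction(-1)]    # negative weights are allowed
points = [Fraction(4), Fraction(-1), Fraction(7)]
lhs = f(affine_combination(weights, points))
rhs = affine_combination(weights, [f(p) for p in points])
print(lhs == rhs, f(1 + 2) == f(1) + f(2))
```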

2.2 Finite dimensionality 

For simplicity in our mathematical constructions, we restrict ourselves to the 
case of finite-dimensional spaces, either affine or linear, over the real numbers. 
That is the case of primary interest in CAGD, and it is also the case in which 
the theories of duality and of the symmetric algebra are at their simplest 
and prettiest. Much of those theories carries over to more general contexts:
linear spaces that are infinite-dimensional, linear spaces over fields of prime
characteristic, even modules over commutative rings. But various intriguing
subtleties arise in those wilder contexts, as we discuss in paragraphs labeled
"Math remark" and in Appendix C.

Actually, finite-dimensional linear spaces over the complex numbers are, 
in some ways, even better behaved than those over the real numbers, partic- 
ularly when factoring is involved. For example, before investigating whether 
a form or a site factors over the real numbers, it is often helpful to consider 
the easier question of whether it factors over the complex numbers. 

2.3 Linear-space duality 

Recall that the duality of linear algebra, in the finite-dimensional case, is
a symmetric relationship between pairs of spaces. Let X and Y be linear
spaces (a.k.a. vector spaces). The set of all linear maps $f\colon X \to Y$ is another
linear space, written Lin(X, Y). In the particular case Y = R, linear maps
$f\colon X \to \mathbf{R}$ are called linear functionals on X (a.k.a. dual functionals), and
the space Lin(X, R) of all such linear functionals is called the dual of X
and written X*. Repeating that same construction, the linear space of all
second-order maps $\sigma\colon X^* \to \mathbf{R}$ is X** = Lin(Lin(X, R), R), the dual of the
dual of X. There is a natural map from X to X** that takes an element
x of X to the second-order functional $e_x$ defined by $e_x(f) := f(x)$, that is,
to the functional that evaluates its first-order argument f at the datum x.
This natural map is always injective. When dim(X) is finite, the equality
dim(X) = dim(X*) = dim(X**) implies that it must be surjective as well;
that is, every second-order functional $\sigma$ is the evaluate-at-x functional $\sigma = e_x$,
for a unique x in X. The spaces X and X** thus being isomorphic in a natural
way, it does no harm to identify them. So, in the finite-dimensional case,
duality is a symmetric relationship between pairs of spaces. For example,
if we represent the elements of X using column vectors, we then represent
the elements of X* using row vectors and the elements of X** using column
vectors once again.

That explanation of duality is standard in the textbooks; but it has the 
drawback that it treats the spaces X and X* somewhat asymmetrically. We 
viewed an element f of the dual space X* as a first-order functional, while
we viewed an element $x = e_x$ of the primal space X = X** sometimes as the
datum x and sometimes as the corresponding second-order functional $e_x$.

The concept of a pairing map puts the primal and dual spaces on a more
equal footing. Suppose that X and Y are linear spaces of the same finite
dimension k; so, choosing bases, we can think of an element x in X as a
column vector of length k, and the same for an element y in Y. Any bilinear
map $B\colon X \times Y \to \mathbf{R}$ then has an associated k-by-k matrix M, under the
convention that the scalar B(x,y) is the matrix product $B(x,y) := x^t M y$.
The map B is called a pairing between X and Y when its associated matrix
M is invertible. Note that every linear functional on X has the form $x \mapsto x^t p$,
for a unique column vector p of length k. If the matrix M is invertible, then
we have $x^t p = x^t M (M^{-1} p)$, so we can describe that functional equally well
as $x \mapsto B(x, y)$ where $y := M^{-1} p$. Thus, we can use the space Y to represent
the dual space X*. Symmetrically, each linear functional on Y has the form
$y \mapsto q^t y$, for a unique column vector q. If M is invertible, we can describe that
functional equally well as $y \mapsto B(x,y)$ where $x := M^{-t} q$; so we can use the
space X to represent Y*. Thus, once we fix a pairing between two spaces, we
can treat each of them as the dual of the other, without committing ourselves
as to which of the two spaces is the primal and which is the dual.
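
The matrix description of a pairing can be tried out in dimension k = 2. The Python sketch below is only an illustration: the matrix M and the vector p are arbitrary choices, and the helper names `bilinear`, `inverse_2x2`, and `mat_vec` are hypothetical. It finds the element y of Y that represents the functional $x \mapsto x^t p$ on X.

```python
from fractions import Fraction

def bilinear(x, M, y):
    """B(x, y) = x^t M y, with the column vectors x and y given as lists."""
    return sum(x[i] * M[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))

def inverse_2x2(M):
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("M is singular, so B is not a pairing")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(M, p):
    return [sum(row[j] * p[j] for j in range(len(p))) for row in M]

M = [[2, 1], [1, 1]]               # invertible, so B is a pairing
p = [3, -4]                        # the functional x |-> x^t p on X
y = mat_vec(inverse_2x2(M), p)     # its representative in Y, namely M^{-1} p

test_vectors = ([1, 0], [0, 1], [2, 5], [-3, 4])
represents = all(bilinear(x, M, y) == x[0] * p[0] + x[1] * p[1] for x in test_vectors)
print(y, represents)
```

If M were singular, some functionals on X would have no representative in Y, which is why invertibility is the defining condition for a pairing.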

On the other hand, making a temporary convention about that can be
pedagogically helpful. For example, suppose that we have fixed a pairing
B between two linear spaces X and Y. By the way, it is conventional to
denote pairings using angle brackets, so let's switch from writing B(x,y) to
writing $\langle x, y\rangle$. We might call the elements of the space X vectors and think
of them as data, while we call the elements of Y covectors and think of them
as functions. The pairing map then produces the scalar $\langle x, y\rangle$ from the vector
x and the covector y by applying the function y to the datum x, so we have
$\langle x, y\rangle = y(x)$. Having broken the symmetry in this direction, we would call
X the primal space and Y the dual space of the pair. But keep in mind that
we could equally well have broken the symmetry in the opposite direction,
treating x as the function and y as the input datum, with $\langle x, y\rangle = x(y)$. The
underlying reality is symmetric, with both x and y as input data; we break
that symmetry only because an asymmetric situation, with a datum on one
side and a function on the other, is often easier to talk about.

Warning: Different frameworks for CAGD make it pedagogically natural 
to break the symmetry between primal and dual in different directions. A 
linear space of n-forms, for example, is typically thought of as a primal space 
in the homogenized framework. In the paired-algebras framework, on the 
other hand, it is more natural to view the space of n-sites as primal, which 
forces the space of n-forms to be dual. When the words "primal" and "dual"
are used as identifiers in this way, only the context can clarify which is which,
that is, can clarify the direction in which the symmetry is being broken.

2.4 Algebras 

In this monograph, we are going to be extending affine spaces into linear
spaces and linear spaces into commutative algebras. We here review the
mathematical concept of an algebra. Feel free to skip this section on first
reading, referring back to it only as needed.

Fix some field of scalars. In this monograph, that field will always be
the real numbers R, but any field would do. An algebra over that field is
a set G with three operations defined on it: addition, multiplication, and
scalar multiplication. The addition and the multiplication must make G
into a ring. (In keeping with modern practice, we require that any ring,
and hence any algebra, have a multiplicative identity.) The addition and
the scalar multiplication must make G into a linear space. And the two
multiplications must satisfy t(xy) = (tx)y = x(ty), for all scalars t and all
elements x and y of the algebra G. An algebra is commutative when its
multiplication is commutative, that is, when xy = yx.

For example, for any fixed n, the set of all n-by-n real matrices forms an
algebra. The dimension of this algebra, viewed as a linear space, is $n^2$. Once
n exceeds 1, this algebra is noncommutative.
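
For the matrix algebra with n = 2, both claims are easy to spot-check. The Python sketch below uses two arbitrarily chosen matrices and hypothetical helpers `matmul` and `scale`; it checks the compatibility law t(xy) = (tx)y = x(ty) on one instance and exhibits a pair of matrices that do not commute. A single instance illustrates the law but of course does not prove it.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(t, x):
    """Scalar multiple of a matrix."""
    return [[t * entry for entry in row] for row in x]

x = [[1, 2], [3, 4]]
y = [[0, 1], [1, 0]]
t = 5
compatible = scale(t, matmul(x, y)) == matmul(scale(t, x), y) == matmul(x, scale(t, y))
commutative = matmul(x, y) == matmul(y, x)
print(compatible, commutative)
```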

Exercise 2.4-1 Another way to think of an algebra is as a ring that includes
the field of scalars as a central subring. More precisely, show that the definition
of an algebra given above is equivalent to the following: An algebra is
a ring G together with a ring homomorphism $g\colon \mathbf{R} \to G$ with the property
that g(t)x = xg(t), for all x in G and t in R.

Hint: Given a ring G and a homomorphism $g\colon \mathbf{R} \to G$, we can define
a scalar multiplication in G by the rule tx := g(t)x. Conversely, given an
algebra as described above, we can define a ring homomorphism $g\colon \mathbf{R} \to G$
by setting g(t) := t1, where 1 denotes the multiplicative identity in G.

For our purposes, polynomial algebras are the most important examples. 
If V is some set of symbols, then all polynomials with real coefficients and 
with variables drawn from V form a commutative algebra, written R[V]. 
The dimension of this algebra, as a linear space, is infinite whenever V is 
nonempty, since we can form polynomials of arbitrarily high degree. In this 
monograph, the set V will usually be finite; but the polynomial algebra R[V] 
makes sense even when V is infinite. Keep in mind, though, that any single 
polynomial is a sum of finitely many terms, each of finite total degree, and 
hence any single polynomial involves only finitely many variables. 

An algebra is graded when it is expressed as a linear-space direct sum
$G = \bigoplus_{n \ge 0} G_n$ in such a way that the ring multiplication takes $G_i \times G_j$ into
$G_{i+j}$, for all nonnegative i and j. The linear subspace $G_n$ is called the nth
graded slice of the algebra G, and the elements of $G_n$ are called homogeneous
of grade n or of degree n. Every element x of a graded algebra can be written
uniquely as a sum $x = \sum_{n \ge 0} (x)_n$ of its graded components, where the nth
graded component $(x)_n$ is homogeneous of grade n and where only finitely
many of the components are nonzero.

The key example of a graded algebra, for our purposes, is the polynomial
algebra R[V], graded by total degree. Let's denote by $\mathbf{R}_n[V]$ the linear space
of all polynomials that are homogeneous of total degree n in the variables
in V. That space is the nth graded slice of the polynomial algebra R[V]
(and hence might also be denoted $\mathbf{R}[V]_n$). Note that multiplication maps
$\mathbf{R}_i[V] \times \mathbf{R}_j[V]$ into $\mathbf{R}_{i+j}[V]$. We can group the terms of any polynomial f by
their total degree and hence decompose f uniquely as the sum $f = \sum_{n \ge 0} (f)_n$
of its graded components. If the number of variables v := |V| is finite, then
each graded slice of the polynomial algebra R[V] is finite-dimensional; in
fact, by the formula for choosing with repetition, we have

\[ \dim(\mathbf{R}_n[V]) = \binom{n + v - 1}{n}. \]
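
The count of monomials of total degree n in v variables is the number of ways to choose n items from v with repetition, namely the binomial coefficient C(n + v - 1, n). The Python sketch below compares that closed form against a brute-force enumeration; the function names are hypothetical, and the closed form is used only for v at least 1.

```python
from itertools import combinations_with_replacement
from math import comb

def graded_slice_dimension(n, v):
    """Number of monomials of total degree n in v >= 1 variables."""
    return comb(n + v - 1, n)

def monomials(n, variables):
    """All monomials of total degree n, each as a sorted tuple of variable names."""
    return list(combinations_with_replacement(variables, n))

print(graded_slice_dimension(2, 3), graded_slice_dimension(3, 3))
print(len(monomials(3, "uvw")))
```

With the three variables u, v, and w, the slices of degree 2 and 3 have dimensions 6 and 10, matching the dimensions quoted earlier for quadratic and cubic polynomials in u and v.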



An algebra homomorphism is a linear map that is also a ring homomorphism.
Thus, an algebra homomorphism $f\colon G \to H$ must satisfy
f(x + y) = f(x) + f(y), f(tx) = tf(x), f(xy) = f(x)f(y), and f(1) = 1, for all
elements x and y of G and all scalars t.

Exercise 2.4-2 If $g\colon \mathbf{R} \to G$ and $h\colon \mathbf{R} \to H$ are the ring homomorphisms
that describe two algebras G and H as in Exercise 2.4-1, show that an algebra
homomorphism $f\colon G \to H$ is the same thing as a ring homomorphism that
preserves scalars, in the sense that $f \circ g = h$.

Chapter 3

The Nested-Spaces Framework 



The paired algebras are the cornerstones of a new framework for studying 
polynomial functions in CAGD. Before we construct that new framework, 
however, we should discuss the frameworks that are currently used for this 
purpose — of which there are two. In the first of those existing frameworks, 
the one that underlies the thinking of a high-school student, the real-valued,
polynomial functions of various degrees defined on a common affine domain
space form a nested family of linear spaces. In this chapter, we review that
framework, which we'll call the nested-spaces framework. 

As we consider various frameworks for CAGD, we shall use cubic Bezier 
triangles as our motivating problem. So suppose that we want to design a 
surface in a 3-dimensional object space O, with (x,y,z) coordinates. We 
divide that surface into triangular patches, and we specify each such patch 
parametrically as follows: We invent an affine parameter plane A, with (u, v) 
coordinates, and we define the x, y, and z coordinates of the patch to be 
real-valued, polynomial functions on A of total degree at most 3 in u and v.
Any framework for CAGD that we consider must provide the linear spaces 
that are appropriate for studying a problem of that sort. 

3.1 Choosing a Cartesian coordinate system 

In the nested-spaces framework, we begin by setting up a Cartesian coordi- 
nate system in the affine parameter plane A. That is, we choose some point 
C in A to act as the center of our Cartesian coordinate grid, and we choose 
two vectors $\varphi$ and $\psi$ over A to be the unit vectors in the u and v directions.
Each point P in A can then be uniquely expressed in the form

(3.1-1)    $P = C + u(P)\varphi + v(P)\psi$,

for certain real numbers u(P) and v(P), called the Cartesian coordinates of
P. Note that u itself is an affine function $u\colon A \to R$, and the same for v.

We intend to use, as the x, y, and z coordinates of our surface patch,
polynomial functions $f\colon A \to R$ of total degree at most 3. The set of all such
cubic functions forms a linear space of dimension 10, and we often adopt, as
our basis for that linear space, the ten functions defined by the monomials
$u^i v^j$, for $i + j \le 3$. Any cubic function $f\colon A \to R$ can be uniquely expressed
as a linear combination of the ten functions in that power basis:

(3.1-2)    $f = f_{30} u^3 + f_{21} u^2 v + f_{12} u v^2 + f_{03} v^3$
             $+ f_{20} u^2 + f_{11} u v + f_{02} v^2$
             $+ f_{10} u + f_{01} v$
             $+ f_{00}$.

In general, let's denote by $\mathrm{Poly}_{\le n}(A, R)$ the linear space of all functions
$f\colon A \to R$ that can be defined by polynomials of degree at most n in the
variables u and v. Note that, if we adopted some other Cartesian coordinate
system (u', v') for the plane A, based on an origin point C' and unit vectors
$\varphi'$ and $\psi'$, the two systems (u, v) and (u', v') would be affinely related, so we
would end up with the same space of functions $\mathrm{Poly}_{\le n}(A, R)$.

3.2 Picturing the nested-spaces framework 

Figure 3.1 depicts the nested-spaces framework graphically. On the left, we 
have the affine parameter plane A, all by itself. Each of the other shapes 
represents a linear space of interest in CAGD. In the infinite nest of
triangles, the $n$th triangle represents the linear space $\mathrm{Poly}_{\le n}(A, R)$. The space
$\mathrm{Poly}_{\le 0}(A, R)$ of constant, real-valued functions on A has the constant
function 1 as a basis. The space $\mathrm{Poly}_{\le 1}(A, R)$ of affine, real-valued functions on
A has the three functions u, v, and 1 as a basis. The space $\mathrm{Poly}_{\le 2}(A, R)$ has
the six functions $u^2$, $uv$, $v^2$, u, v, and 1 as a basis. And so forth; in general,
we have $\dim(\mathrm{Poly}_{\le n}(A, R)) = \binom{n+2}{2} = \binom{n+2}{n}$.
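As a quick check of the nesting and of the dimension count, here is a small sketch that represents a function in the power basis as a dictionary from exponent pairs (i, j) to coefficients. That representation, and the names `power_basis` and `evaluate`, are assumptions made for illustration only.

```python
from math import comb

def power_basis(n):
    """Exponent pairs (i, j) of the monomials u^i v^j with i + j <= n."""
    return [(i, j) for i in range(n + 1) for j in range(n + 1 - i)]

def evaluate(coeffs, u, v):
    """Evaluate f = sum of coeffs[(i, j)] * u^i * v^j at coordinates (u, v)."""
    return sum(c * u**i * v**j for (i, j), c in coeffs.items())

# The dimension of the space of functions of degree at most n is C(n+2, 2).
for n in range(6):
    assert len(power_basis(n)) == comb(n + 2, 2)

# Nesting: the basis for degree at most 2 sits inside the basis for degree at most 3.
assert set(power_basis(2)) <= set(power_basis(3))

f = {(3, 0): 1.0, (1, 1): -2.0, (0, 0): 5.0}   # f = u^3 - 2uv + 5
value = evaluate(f, 2.0, 1.0)                   # 8 - 4 + 5
```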

The union $\bigcup_{n \ge 0} \mathrm{Poly}_{\le n}(A, R)$ of the nested triangles is an algebra, which
we shall denote Poly(A, R): the algebra of all real-valued functions on A that
can be defined by polynomials in u and v of any degree. By mapping each
such function to its defining polynomial, we see that the algebra of functions
Poly(A, R) is isomorphic to the polynomial algebra R[u, v].

Figure 3.1: The nested-spaces framework

Warning: The polynomial algebra R[u, v] is graded by total degree; but
the algebra Poly(A, R) of polynomial functions has no natural grading. For
example, it makes sense to distinguish those polynomials in R[u, v] that are
homogeneous of total degree 3, that is, the linear combinations of $u^3$, $u^2 v$,
$u v^2$, and $v^3$. Indeed, that 4-dimensional linear space of homogeneous cubics
is precisely $R_3[u, v]$, the third graded slice of the algebra R[u, v]. But it
wouldn't make sense to distinguish the corresponding functions in the algebra
Poly(A, R). The functions in Poly(A, R) whose defining polynomials lie in
$R_3[u, v]$ are the ones that vary as a cubic function of the distance from the
point C, that is, the functions $f$ that satisfy $f(C + t(P - C)) = t^3 f(P)$, for
all points P and real numbers t. But the center point C of our coordinate
system for the affine plane A was an arbitrary choice; that's part of what
it means for the plane A to be affine. Thus, requiring a function $f\colon A \to R$
to be a homogeneous cubic doesn't make sense, since the affine plane A has
no preferred origin to be homogeneous around.
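To see concretely why homogeneity depends on the choice of center, we can re-expand a function around a different center point. The sketch below assumes the dictionary representation {(i, j): coefficient} for polynomials in u and v; the helper names `shift_center` and `is_homogeneous` are hypothetical.

```python
from math import comb

def shift_center(coeffs, a, b):
    """Re-express f in the coordinates u' = u - a, v' = v - b.

    Substitutes u = u' + a and v = v' + b and expands each monomial
    by the binomial theorem.
    """
    out = {}
    for (i, j), c in coeffs.items():
        for k in range(i + 1):
            for l in range(j + 1):
                term = c * comb(i, k) * a**(i - k) * comb(j, l) * b**(j - l)
                out[(k, l)] = out.get((k, l), 0) + term
    return {e: c for e, c in out.items() if c != 0}

def is_homogeneous(coeffs, n):
    """True when every monomial present has total degree exactly n."""
    return all(i + j == n for (i, j) in coeffs)

f = {(3, 0): 1, (1, 2): 2}    # u^3 + 2 u v^2: a homogeneous cubic around C
g = shift_center(f, 1, 0)     # the same function, expanded around C' = C + phi

assert is_homogeneous(f, 3)
assert not is_homogeneous(g, 3)             # lower-degree terms have appeared
assert max(i + j for (i, j) in g) == 3      # but the degree is still at most 3
```

The total degree survives the change of center, while homogeneity does not; that is the sense in which the nested spaces are well defined on A but the graded slices are not.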

3.3 The dual spaces 

In the nested-spaces framework, the linear spaces $\mathrm{Poly}_{\le n}(A, R)$ are thought
of as primal; that is, the symmetry discussed in Section 2.3 is broken in the 
direction that views a polynomial function on A to be a primal object. The 
duals of those primal spaces are shown in Figure 3.1 as rounded rectangles, 
each linked by a double-headed arrow to its primal partner. 

Consider the space $\mathrm{Poly}_{\le 3}(A, R)^*$, for example. An element $\sigma$ of this
space is a dual functional defined on cubic functions, that is, it is a linear
map $\sigma\colon \mathrm{Poly}_{\le 3}(A, R) \to R$. If we adopt the power basis $(u^i v^j)_{i+j \le 3}$ for the
space $\mathrm{Poly}_{\le 3}(A, R)$ of cubic functions, it is natural to adopt the dual basis
for the space $\mathrm{Poly}_{\le 3}(A, R)^*$. That dual basis consists of the ten functionals
$(\tau_{ij})_{i+j \le 3}$ determined by the duality constraints

    $\tau_{ij}(u^k v^l) = \begin{cases} 1 & \text{if } i = k \text{ and } j = l \\ 0 & \text{otherwise.} \end{cases}$

Given any cubic function $f$ in $\mathrm{Poly}_{\le 3}(A, R)$, we can use the functionals in
this dual basis to compute the coefficients that are needed to expand $f$ in
the power basis; that is, Equation 3.1-2 holds just when the ten coefficients
$(f_{ij})_{i+j \le 3}$ are determined by the equations $f_{ij} = \tau_{ij}(f)$.

Of course, since duality is symmetric, the same holds the other way 
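around. An arbitrary dual functional $\sigma$ in the space $\mathrm{Poly}_{\le 3}(A, R)^*$ can be
uniquely expressed as a linear combination of the elements of the dual basis,

(3.3-1)    $\sigma = \sigma_{30}\tau_{30} + \sigma_{21}\tau_{21} + \sigma_{12}\tau_{12} + \sigma_{03}\tau_{03}$
             $+ \sigma_{20}\tau_{20} + \sigma_{11}\tau_{11} + \sigma_{02}\tau_{02}$
             $+ \sigma_{10}\tau_{10} + \sigma_{01}\tau_{01}$
             $+ \sigma_{00}\tau_{00}$,

and the ten coefficients $(\sigma_{ij})_{i+j \le 3}$ in this expansion are given by $\sigma_{ij} = \sigma(u^i v^j)$.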
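The symmetry between the two expansions can be made concrete with a small sketch. Both a cubic function and a dual functional are stored here as dictionaries indexed by the exponent pair (i, j); that encoding and the names `tau`, `pair`, and `monomial` are assumptions chosen for illustration.

```python
# Exponent pairs (i, j) with i + j <= 3: the ten power-basis monomials.
EXPONENTS = [(i, j) for i in range(4) for j in range(4 - i)]

def monomial(i, j):
    """The basis function u^i v^j, as a coefficient dictionary."""
    return {(i, j): 1}

def tau(i, j):
    """The dual-basis functional tau_ij: extract the coefficient of u^i v^j."""
    return lambda f: f.get((i, j), 0)

def pair(sigma, f):
    """Apply sigma = sum of sigma_ij tau_ij to f = sum of f_ij u^i v^j."""
    return sum(s * f.get(e, 0) for e, s in sigma.items())

# Duality constraints: tau_ij(u^k v^l) is 1 when (i, j) = (k, l), else 0.
for (i, j) in EXPONENTS:
    for (k, l) in EXPONENTS:
        assert tau(i, j)(monomial(k, l)) == (1 if (i, j) == (k, l) else 0)

# The coefficients of a functional are recovered as sigma_ij = sigma(u^i v^j).
sigma = {(3, 0): 2, (1, 1): -1, (0, 0): 4}
for (i, j) in EXPONENTS:
    assert pair(sigma, monomial(i, j)) == sigma.get((i, j), 0)
```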



3.4 Interpreting elements of the dual spaces 

Of the ten dual functionals in our basis $(\tau_{ij})_{i+j \le 3}$ for the space $\mathrm{Poly}_{\le 3}(A, R)^*$,
one has a particularly simple interpretation. Evaluating Equation 3.1-2 at
the center point C, the point in the domain plane A with coordinates u(C) =
v(C) = 0, we see that $f(C) = f_{00} = \tau_{00}(f)$. So the functional $\tau_{00}$ evaluates
its argument at the center point C.

Evaluation at any fixed point constitutes a dual functional. If we evaluate 
Equation 3.1-2 at $P = C + u(P)\varphi + v(P)\psi$, we find that

    $f(P) = f_{30} u(P)^3 + f_{21} u(P)^2 v(P) + f_{12} u(P) v(P)^2 + f_{03} v(P)^3$
           $+ f_{20} u(P)^2 + f_{11} u(P) v(P) + f_{02} v(P)^2$
           $+ f_{10} u(P) + f_{01} v(P)$
           $+ f_{00}$.

It follows that evaluation at the point P is the dual functional $e_P$ given by

    $e_P = u(P)^3 \tau_{30} + u(P)^2 v(P) \tau_{21} + u(P) v(P)^2 \tau_{12} + v(P)^3 \tau_{03}$
          $+ u(P)^2 \tau_{20} + u(P) v(P) \tau_{11} + v(P)^2 \tau_{02}$
          $+ u(P) \tau_{10} + v(P) \tau_{01}$
          $+ \tau_{00}$.

Typically, though, when we expand a dual functional in terms of our chosen
basis $(\tau_{ij})_{i+j \le 3}$, the expansion won't have that special form for any two
scalars u(P) and v(P). Thus, a typical dual functional does not correspond
to evaluation at any point. We can express any dual functional as a linear
combination of point evaluations (as follows from Lemma 7.2-2). Alternatively,
we can express any dual functional as a certain differential operator
(as discussed in Section 7.10). But let's put those topics aside for now.
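The special form of an evaluation functional is easy to test mechanically: its coefficient on $\tau_{ij}$ must equal $u^i v^j$ for the two scalars that appear as its coefficients on $\tau_{10}$ and $\tau_{01}$. The following sketch assumes the dictionary encoding {(i, j): coefficient} for both functions and functionals; all function names are hypothetical.

```python
from fractions import Fraction

EXPONENTS = [(i, j) for i in range(4) for j in range(4 - i)]   # i + j <= 3

def evaluate(f, u, v):
    """Value of f = sum of f_ij u^i v^j at the point with coordinates (u, v)."""
    return sum(c * u**i * v**j for (i, j), c in f.items())

def pair(sigma, f):
    """Apply the functional sigma = sum of sigma_ij tau_ij to f."""
    return sum(s * f.get(e, 0) for e, s in sigma.items())

def evaluation_functional(u, v):
    """Coefficients of e_P in the dual basis: the coefficient of tau_ij is u^i v^j."""
    return {(i, j): u**i * v**j for (i, j) in EXPONENTS}

def is_point_evaluation(sigma):
    """Check whether sigma has the special form of e_P for some point P."""
    u, v = sigma.get((1, 0), 0), sigma.get((0, 1), 0)
    return all(sigma.get((i, j), 0) == u**i * v**j for (i, j) in EXPONENTS)

f = {(3, 0): 2, (1, 1): -1, (0, 2): 4, (0, 0): 7}
u, v = Fraction(1, 2), 3
e_P = evaluation_functional(u, v)

assert pair(e_P, f) == evaluate(f, u, v)
assert is_point_evaluation(e_P)
# tau_10 alone is a dual functional, but it is not evaluation at any point,
# since an evaluation always has coefficient 1 on tau_00.
assert not is_point_evaluation({(1, 0): 1})
```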

3.5 Are the dual spaces nested? 

The primal spaces $\mathrm{Poly}_{\le n}(A, R)$, the triangles in Figure 3.1, are nested. It
would be nice if the dual spaces $\mathrm{Poly}_{\le n}(A, R)^*$ were also nested; so it is
important to thoroughly understand that they are not. They would be if the 
concept "subspace" in linear algebra were self-dual. But the concept that is 
dual to "subspace" is "quotient space": When X and Y are linear spaces, 
X is a subspace of Y if and only if X* is a quotient space of Y*. This is 
standard linear algebra, but we review it here for completeness. 

Consider the space $\mathrm{Poly}_{\le 2}(A, R)$ of quadratic functions on A, sitting as
a subspace inside the space $\mathrm{Poly}_{\le 3}(A, R)$ of cubic functions on A. We have
$\dim(\mathrm{Poly}_{\le 2}(A, R)) = 6$, while $\dim(\mathrm{Poly}_{\le 3}(A, R)) = 10$. Suppose that we
expand a cubic function $f$ in terms of the power basis $(u^i v^j)_{i+j \le 3}$ as shown
in Equation 3.1-2. The function $f$ will also be quadratic (that is, will lie in
the subspace $\mathrm{Poly}_{\le 2}(A, R)$) just when the four cubic coefficients $f_{30}$, $f_{21}$,
$f_{12}$, and $f_{03}$ are all zero.

What happens in the dual spaces? Because we have singled out the
subspace $\mathrm{Poly}_{\le 2}(A, R)$ of the primal space $\mathrm{Poly}_{\le 3}(A, R)$, there is a certain
subset of the dual space $\mathrm{Poly}_{\le 3}(A, R)^*$ that we can single out in a natural
way: the annihilator of $\mathrm{Poly}_{\le 2}(A, R)$, written $\mathrm{Ann}(\mathrm{Poly}_{\le 2}(A, R))$. A
dual functional $\sigma$ in $\mathrm{Poly}_{\le 3}(A, R)^*$ belongs to $\mathrm{Ann}(\mathrm{Poly}_{\le 2}(A, R))$ just when
$\sigma(f) = 0$, for all $f$ in $\mathrm{Poly}_{\le 2}(A, R)$. Unfortunately, there is no hope that this
annihilator subspace will coincide with or can even somehow represent the
smaller dual space $\mathrm{Poly}_{\le 2}(A, R)^*$, since the dimensions are wrong. We have
$\dim(\mathrm{Poly}_{\le 2}(A, R)^*) = 6$; but a dual functional annihilates $\mathrm{Poly}_{\le 2}(A, R)$ just
when it can be written as a linear combination of the four functionals $\tau_{30}$,
$\tau_{21}$, $\tau_{12}$, and $\tau_{03}$, so $\dim(\mathrm{Ann}(\mathrm{Poly}_{\le 2}(A, R))) = 4$. Here is what is true instead:
The smaller dual space $\mathrm{Poly}_{\le 2}(A, R)^*$ is isomorphic in a natural way to the
quotient space $\mathrm{Poly}_{\le 3}(A, R)^* / \mathrm{Ann}(\mathrm{Poly}_{\le 2}(A, R))$. But there is no natural
way to single out one 6-dimensional subspace of $\mathrm{Poly}_{\le 3}(A, R)^*$ to represent
$\mathrm{Poly}_{\le 2}(A, R)^*$; that is, singling out one such subspace would involve making
an arbitrary choice.

As it happens, we have already made an adequate arbitrary choice: our 
choice of the point C as the center of our Cartesian coordinate system in 
the plane A. As we discussed at the end of Section 3.2, the choice of C 
determines a 4-dimensional subspace of $\mathrm{Poly}_{\le 3}(A, R)$: the functions given
by polynomials that are homogeneous of degree 3 in u and v, that is, the
functions that are homogeneous cubics around C. Call that space $H_C$. The
annihilator $\mathrm{Ann}(H_C)$ is a 6-dimensional subspace of $\mathrm{Poly}_{\le 3}(A, R)^*$ that we
could use to represent $\mathrm{Poly}_{\le 2}(A, R)^*$. But we want the structures in our
frameworks to be independent of the coordinate system that we choose for 
the affine space A, so this path to nested dual spaces is closed to us. 



Chapter 4 

The Homogenized Framework 



In the nested-spaces framework, the primal spaces $\mathrm{Poly}_{\le n}(A, R)$ are nested,
but the dual spaces $\mathrm{Poly}_{\le n}(A, R)^*$ are not. Since duality is a symmetric
relationship, that lack of symmetry constitutes a flaw. Furthermore, forcing 
the dual spaces to be nested as well would involve making arbitrary choices, 
as we discussed in Section 3.5; so we aren't willing to repair the flaw that 
way. The only other way to repair the flaw is to eliminate the nesting of the 
primal spaces. Fortunately, we can eliminate the primal nesting by making 
a simple change in our framework, to wit, by homogenizing. Indeed, if we 
adopt Bernstein bases for our primal spaces, rather than power bases, this 
homogenization happens automatically. The resulting homogenized frame- 
work is the framework for studying polynomial functions that underlies most 
current research in CAGD. 



4.1 To n-forms via barycentric coordinates

As an easy introduction to the homogenized framework, let's study cubic 
Bezier triangles once again, but using a Bernstein basis, rather than a power 
basis. Let $\triangle QRS$ be a reference triangle in the parameter plane A. Any
point P in A can be uniquely represented as an affine combination of the
three vertices Q, R, and S:

(4.1-1)    $P = q(P)Q + r(P)R + s(P)S$    where $q(P) + r(P) + s(P) = 1$.

The numbers (q(P), r(P), s(P)) are called the barycentric coordinates of P,
while the triple of affine functions (q, r, s) is called a barycentric coordinate
system for the plane A. Every cubic function $f\colon A \to R$ can be uniquely
expressed as a homogeneous cubic polynomial in the variables q, r, and s.
The Bernstein basis for the space of such functions consists of the ten cubic
monomials in q, r, and s, scaled by trinomial coefficients:

    $\left( \binom{3}{i\,j\,k} q^i r^j s^k \right)_{i+j+k=3}$

Every cubic function $f\colon A \to R$ can be expanded uniquely as a linear
combination of those basis functions,

(4.1-2)    $f = f_{300}\, q^3$
             $+ f_{210}\, 3q^2 r + f_{201}\, 3q^2 s$
             $+ f_{120}\, 3q r^2 + f_{111}\, 6qrs + f_{102}\, 3q s^2$
             $+ f_{030}\, r^3 + f_{021}\, 3r^2 s + f_{012}\, 3r s^2 + f_{003}\, s^3$,

and the coefficients $(f_{ijk})_{i+j+k=3}$ in this expansion are known as the Bezier
ordinates of the function $f$.
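Equation 4.1-2 translates directly into a short evaluation routine. The sketch below is only illustrative: it stores the Bezier ordinates in a dictionary keyed by (i, j, k), and the names `trinomial`, `bernstein`, and `evaluate` are assumptions, not notation from the text.

```python
from fractions import Fraction
from math import factorial

# Index triples (i, j, k) with i + j + k = 3: one per Bernstein basis function.
INDICES = [(i, j, 3 - i - j) for i in range(4) for j in range(4 - i)]

def trinomial(i, j, k):
    """The trinomial coefficient (i + j + k)! / (i! j! k!)."""
    return factorial(i + j + k) // (factorial(i) * factorial(j) * factorial(k))

def bernstein(i, j, k, q, r, s):
    """The Bernstein basis function: a trinomial coefficient times q^i r^j s^k."""
    return trinomial(i, j, k) * q**i * r**j * s**k

def evaluate(ordinates, q, r, s):
    """Evaluate the cubic with the given Bezier ordinates at (q, r, s)."""
    return sum(ordinates[ijk] * bernstein(*ijk, q, r, s) for ijk in INDICES)

assert len(INDICES) == 10

# With every ordinate equal to 1, the sum is (q + r + s)^3, which is 1
# whenever (q, r, s) are barycentric coordinates of a point.
ones = {ijk: 1 for ijk in INDICES}
assert evaluate(ones, Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)) == 1

# At the vertex Q, with coordinates (1, 0, 0), only the ordinate f_300 contributes.
f = {ijk: 0 for ijk in INDICES}
f[(3, 0, 0)] = 5
f[(1, 1, 1)] = -2
assert evaluate(f, 1, 0, 0) == 5
```

Nothing in `evaluate` requires q + r + s = 1; for example, the all-ones ordinates give 27 at (1, 1, 1), which anticipates the extension to n-forms discussed next.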

In general, let's denote by $\mathrm{Poly}_n(\hat{A}, R)$ the linear space of all functions
that can be defined by polynomials that are homogeneous of degree n in the
variables q, r, and s. Such a function is called an n-form on the plane A.
Note that an n-form has a well-defined value for any triple of scalars (q, r, s),
even when the sum q + r + s differs from 1. As we discuss shortly, it is for that
reason that we write $\mathrm{Poly}_n(\hat{A}, R)$, with a hat accent over the A. (Indeed,
it wouldn't make sense to write "$\mathrm{Poly}_n(A, R)$", without the hat accent. As
we discussed in Section 3.2, the affine space A has no preferred center point
around which to require a polynomial function to be homogeneous.)

Have we pulled apart the nested spaces? The nesting arose because we 
considered the constant function 1, for example, to be a function of degree 
at most n, for every nonnegative n. But the single function 1 has now given 
rise to an infinite sequence of distinct forms: the constant form 1, the linear 
form q + r + s, the quadratic form $(q + r + s)^2$, and so forth. In this way, we
have converted the nested spaces $\mathrm{Poly}_{\le 0}(A, R) \subset \mathrm{Poly}_{\le 1}(A, R) \subset \cdots$ into
disjoint spaces $\mathrm{Poly}_0(\hat{A}, R)$, $\mathrm{Poly}_1(\hat{A}, R)$, and so forth. (To be picky, those
latter spaces are only almost disjoint: They share a common origin, since the
zero function on $\hat{A}$ is an n-form for every $n \ge 0$.)

4.2 To n-forms via a weight coordinate

While Bernstein bases lead naturally to the homogenized framework, we can 
start with power bases and still end up homogenized as follows. Given the 
center point C and the unit vectors $\varphi$ and $\psi$ in the plane A, we write each
point P in A in the form

(4.2-1)    $P = w(P)C + u(P)\varphi + v(P)\psi$    where $w(P) = 1$.

Figure 4.1: The linearization $\hat{A}$ of the affine plane A

Like Equation 4.1-1, but unlike Equation 3.1-1, this equation gives each
point P three coordinates, subject to one affine constraint: the coordinates
(w(P), u(P), v(P)), subject to the constraint w(P) = 1. Any polynomial
of total degree at most n in the variables u and v can be converted into
an equivalent polynomial that is homogeneous of degree n in the variables
w, u, and v simply by adding factors of w to each term as appropriate,
the term $t u^i v^j$ becoming $t w^{n-i-j} u^i v^j$. This process is called homogenization.
Homogenizing each function in the space $\mathrm{Poly}_{\le n}(A, R)$ leads to the same
linear space $\mathrm{Poly}_n(\hat{A}, R)$ of n-forms that we arrived at via the Bernstein
basis, since the coordinate systems (q, r, s) and (w, u, v) are linearly related.
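Homogenization is mechanical enough to sketch in a few lines. The code below assumes that a polynomial in u and v is stored as {(i, j): t} and that a form in w, u, and v is stored as {(a, i, j): t}, with a the exponent of w; those encodings and the function names are illustrative assumptions.

```python
def homogenize(coeffs, n):
    """Pad each term t u^i v^j with factors of w, giving t w^(n-i-j) u^i v^j."""
    if any(i + j > n for (i, j) in coeffs):
        raise ValueError("polynomial has degree greater than n")
    return {(n - i - j, i, j): t for (i, j), t in coeffs.items()}

def dehomogenize(form):
    """Go back by substituting w := 1, which simply forgets the exponent of w."""
    out = {}
    for (_, i, j), t in form.items():
        out[(i, j)] = out.get((i, j), 0) + t
    return out

def eval_form(form, w, u, v):
    """Evaluate a form at an arbitrary triple of coordinates (w, u, v)."""
    return sum(t * w**a * u**i * v**j for (a, i, j), t in form.items())

f = {(3, 0): 1, (1, 1): -2, (0, 0): 5}      # u^3 - 2uv + 5
F = homogenize(f, 3)                         # u^3 - 2wuv + 5w^3

assert dehomogenize(F) == f
# On the plane w = 1 the form agrees with the original function ...
assert eval_form(F, 1, 2, 1) == 9
# ... and off that plane it scales as a homogeneous cubic should.
assert eval_form(F, 2, 4, 2) == 2**3 * eval_form(F, 1, 2, 1)
```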

4.3 The linearization of an affine space 

But wait a minute: What space is it that has (q, r, s) and (w, u, v) as two
possible coordinate systems? Well, it is surely a 3-dimensional space, and
we want it to be linear, rather than merely affine. It includes the plane A as
an affine hyperplane, to wit, the hyperplane q + r + s = 1 or, equivalently,
w = 1. Those properties are enough to determine it uniquely, up to a unique
isomorphism, as shown in Figure 4.1. It has various names and is written in
various ways; let's call it the linearization of A and write it $\hat{A}$.

Here is another way to describe how homogenization works. We start 
with polynomial functions of degree at most n, defined on an affine space A. 
We could choose some point in A, such as C, to act something like an origin. 
But different n-ic functions are homogeneous around different points in A, 
and we don't want to play favorites; worse yet, many n-ic functions aren't 
homogeneous around any point in A. Instead, we adjoin to A an origin that 
lies outside of A, a common origin that all n-ics can be homogeneous around.
The linear span of A with respect to this new, exterior origin is a linear space
$\hat{A}$, with $\dim(\hat{A}) = \dim(A) + 1$. Any polynomial function $f\colon A \to R$ of degree
at most n extends uniquely to a function $\hat{f}\colon \hat{A} \to R$ that is homogeneous
of degree n, that is, extends to an n-form. To effect that extension, let
$w\colon \hat{A} \to R$ be the unique linear functional on $\hat{A}$ that takes the constant value
1 on the hyperplane A, so $A = w^{-1}(1)$; a value of this functional w is often
called a weight. To compute $\hat{f}$, we take the polynomial that defines $f$ and we
add factors of the weight functional w to each term, as needed, to bring that
term up to the proper total degree. Going back from $\hat{f}$ to $f$ is even easier:
We simply substitute w := 1. As a result, we can treat the function $f$ and
the n-form $\hat{f}$ as two aspects of the same underlying reality.

In the particular case n = 1, the affine function q on A extends uniquely
to a 1-form on A, that is, to a linear functional on $\hat{A}$. For simplicity, we shall
use the same symbol q to denote that linear functional, rather than writing
$\hat{q}$. The same goes for the functionals r, s, w, u, and v; indeed, we wrote w
already in the last paragraph, rather than $\hat{w}$. In this way, each of the triples
(q, r, s) and (w, u, v) now constitutes a linear coordinate system on the linear
space $\hat{A}$, and those two systems are related by some invertible 3-by-3 matrix.
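For a concrete reference triangle, the 3-by-3 matrix relating the two coordinate systems can be written down directly: its columns are the (w, u, v) coordinates of the vertices Q, R, and S. The sketch below uses a hypothetical triangle and solves the small linear system by Cramer's rule; the vertex coordinates and function names are assumptions made for illustration.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3-by-3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def matvec(m, x):
    """Matrix times column vector."""
    return [sum(row[k] * x[k] for k in range(3)) for row in m]

def solve3(m, y):
    """Solve m x = y by Cramer's rule (adequate for a 3-by-3 sketch)."""
    d = det3(m)
    solution = []
    for k in range(3):
        mk = [[y[r] if c == k else m[r][c] for c in range(3)] for r in range(3)]
        solution.append(Fraction(det3(mk), d))
    return solution

# Hypothetical reference triangle, given by the Cartesian coordinates (u, v)
# of its vertices; each vertex is a point, so its weight w is 1.
Q, R, S = (0, 0), (4, 0), (0, 2)
M = [[1,    1,    1],
     [Q[0], R[0], S[0]],
     [Q[1], R[1], S[1]]]

def bary_to_cart(q, r, s):
    """Convert coordinates (q, r, s) to coordinates (w, u, v)."""
    return matvec(M, [q, r, s])

def cart_to_bary(w, u, v):
    """Convert coordinates (w, u, v) to coordinates (q, r, s)."""
    return solve3(M, [w, u, v])

# A point (weight 1) gets barycentric coordinates that sum to 1 ...
assert cart_to_bary(1, 1, 1) == [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
# ... while the unit vector in the u direction (weight 0) gets coordinates that sum to 0.
assert sum(cart_to_bary(0, 1, 0)) == 0
assert bary_to_cart(*cart_to_bary(1, 1, 1)) == [1, 1, 1]
```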

Marcel Berger [3] gives a thorough explanation of linearization. He (or 
perhaps his translator, Silvio Levy) refers to the space $\hat{A}$ as the universal
space of A, since $\hat{A}$ satisfies a certain universal mapping condition, as we
discuss in Section 9.1. But lots of things satisfy universal mapping conditions;
it seems more specific to refer to $\hat{A}$ as the linearization of A.

4.4 A new term: "anchor" 

Linearization is a central technique in CAGD, but it is not understood as 
clearly as it should be. One reason is people's reluctance to add one more 
dimension — especially to move from 3 dimensions, which they can visualize, 
to 4 dimensions, which they cannot. But a simpler stumbling block is the lack 
of good terminology. Given an affine space A sitting inside its linearization
$\hat{A}$, we want to reserve the term "point" for the elements of A, that is, the
elements P of $\hat{A}$ that have weight 1, that satisfy w(P) = 1. In the same
spirit, we want to reserve the term "vector" for the elements $\pi$ of $\hat{A}$ that
have weight 0. We then have the familiar equations "point - point = vector"
and "point ± vector = point". But what name should we use for an arbitrary
element p of $\hat{A}$? No good term has yet taken hold. Most authors use a phrase
like "weighted point", "mass point", or "punctual mass". The justification
for such names is that any element p of $\hat{A}$ whose weight is nonzero can be
written as a scalar multiple of a point: We have p = w(p)(p/w(p)), where
p/w(p) is a point because w(p/w(p)) = w(p)/w(p) = 1. But note that
nonzero vectors over A are elements of the linearization $\hat{A}$ also, and they
can't be written as scalar multiples of points. On that basis, Fiorot and
Jeannin [21] proposed the term "massic vector".

But none of those phrases is adequate. We should dignify the elements 
of the linearization by giving them a single-word name. Here is my proposal: 
Given any affine space A, let's refer to an element of its linearization $\hat{A}$ as
an anchor over A. So a point in A is an anchor over A of weight 1, while
a vector over A is an anchor over A of weight 0. Every anchor is either a
vector or a scalar multiple of a point.
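The trichotomy among anchors is easy to express in code. In the sketch below, an anchor over the plane A is stored as a tuple of its coordinates (w, u, v), with the weight first; that storage convention and the function names are assumptions made for illustration.

```python
from fractions import Fraction

def weight(p):
    """The weight of an anchor p = (w, u, v)."""
    return p[0]

def classify(p):
    """Name an anchor by its weight."""
    w = weight(p)
    if w == 1:
        return "point"
    if w == 0:
        return "vector"
    return "scalar multiple of a point"

def add(p, q):
    """Add two anchors coordinatewise."""
    return tuple(a + b for a, b in zip(p, q))

def scale(t, p):
    """Multiply an anchor by a scalar."""
    return tuple(t * a for a in p)

def underlying_point(p):
    """For an anchor of nonzero weight, return the point p / w(p)."""
    w = weight(p)
    if w == 0:
        raise ValueError("a vector is not a scalar multiple of a point")
    return scale(Fraction(1) / w, p)

P, Q = (1, 2, 3), (1, 5, 1)
difference = add(P, scale(-1, Q))

assert classify(difference) == "vector"              # point - point = vector
assert classify(add(P, difference)) == "point"       # point + vector = point
assert classify(scale(3, P)) == "scalar multiple of a point"
assert underlying_point(scale(3, P)) == P            # p = w(p) (p / w(p))
```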

(In defense of the word "anchor", it connotes a fixed point and something
weighty, both of which are appropriate. Indeed, control points in computer
drawing systems are sometimes called "anchors". Also, there is no estab- 
lished mathematical meaning of "anchor" with which this new sense might 
be confused. Finally, it is quite convenient that the noun "anchor" has two 
syllables and ends in "-or", like "vector" and "tensor".) 

Consider the domain plane A of a cubic Bezier triangle $F\colon A \to O$, as
in our recurring example. Every anchor p over the plane A can be written
uniquely as a linear combination

(4.4-1)    $p = w(p)C + u(p)\varphi + v(p)\psi$,

where we no longer place any constraint on the weight w(p). Equivalently,
every anchor p can be written uniquely as a linear combination

(4.4-2)    $p = q(p)Q + r(p)R + s(p)S$,

with no constraint on the sum q(p) + r(p) + s(p).

4.5 Coanchors 

Now that an element of the linearized space $\hat{A}$ is an anchor over A, an element
of the dual space $\hat{A}^*$ (that is, a linear functional on anchors) is a coanchor
on A (nothing to do with a co-anchor of a television newscast). In particular,
the linear functionals q, r, s, w, u, and v are coanchors on A. The weight
coanchor is the coanchor w = q + r + s that satisfies $A = w^{-1}(1)$. A Cartesian
coordinate system for A, such as (w, u, v), is a basis of $\hat{A}^*$ that contains the
weight coanchor as one basis element. A barycentric coordinate system, such
as (q, r, s), is a basis of $\hat{A}^*$ whose coanchors sum to the weight.

Using Cartesian coordinates, every coanchor h on the plane A can be 
written uniquely as a linear combination of the coanchors w, u, and v: 

    $h = C(h)w + \varphi(h)u + \psi(h)v$
      $= h(C)w + h(\varphi)u + h(\psi)v$
      $= \langle h, C \rangle w + \langle h, \varphi \rangle u + \langle h, \psi \rangle v$.






We wrote the right-hand side of that equation three times because there is 
an issue about how to write it. On the first line, we wrote the exact dual of 
Equation 4.4-1. The coefficients on that first line look strange because we 
aren't used to treating an anchor as a function that gets applied to a coanchor 
as its input datum. We typically prefer to break the symmetry in the opposite 
direction, treating the coanchor as the function and the anchor as the datum, 
as on the second line. Of course, the underlying reality is symmetric, as we 
discussed in Section 2.3: We are really pairing the coanchor with the anchor, 
however we choose to write it. 

In barycentric coordinates, the story is much the same. We can write any 
coanchor h on the plane A uniquely as a linear combination of the coanchors 
q, r, and s: 

    $h = \langle h, Q \rangle q + \langle h, R \rangle r + \langle h, S \rangle s$.

The dual of a coordinate system is a reference frame. For example, the 
reference frame for the plane A that is dual to the Cartesian coordinate 
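system (w, u, v) consists of the center point C and the unit vectors $\varphi$ and $\psi$.
The three anchors $(C, \varphi, \psi)$ form a basis for the linear space $\hat{A}$ of anchors
over A, and we have the duality constraints

    $\langle w, C \rangle = 1$    $\langle w, \varphi \rangle = 0$    $\langle w, \psi \rangle = 0$
    $\langle u, C \rangle = 0$    $\langle u, \varphi \rangle = 1$    $\langle u, \psi \rangle = 0$
    $\langle v, C \rangle = 0$    $\langle v, \varphi \rangle = 0$    $\langle v, \psi \rangle = 1$.

In general, a Cartesian reference frame for an affine space is a basis for its
linearization that is all vectors, except for a single point.

The reference frame that is dual to the barycentric coordinate system
(q, r, s) consists of the three points Q, R, and S. Those three points also
form a basis for the linearization $\hat{A}$, and they satisfy the duality constraints

    $\langle q, Q \rangle = 1$    $\langle q, R \rangle = 0$    $\langle q, S \rangle = 0$
    $\langle r, Q \rangle = 0$    $\langle r, R \rangle = 1$    $\langle r, S \rangle = 0$
    $\langle s, Q \rangle = 0$    $\langle s, R \rangle = 0$    $\langle s, S \rangle = 1$.

In general, a barycentric reference frame for an affine space is a basis for its
linearization that consists entirely of points.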

A comment about notation: We are denoting the fundamental pairing 
between the linear space $\hat{A}$ of anchors and the linear space $\hat{A}^*$ of coanchors
as a function $\langle\ ,\ \rangle\colon \hat{A}^* \times \hat{A} \to R$. In particular, given an anchor p and a
coanchor h, we shall pair them by writing $\langle h, p \rangle$, with h on the left and p
on the right. We adopt that convention for two related reasons. First, when
breaking the symmetry, people more often think of the coanchor h as the
function and the anchor p as its input datum, and it is convenient to end up
with the function on the left. Second, people typically represent an anchor
(or vector) in coordinates as a column of numbers, while they represent a
coanchor (or covector) as a row of numbers; it is convenient to end up with
the row to the left of the column, so that the dot product that effects the
pairing follows the standard rules for matrix multiplication. Hence, we prefer
to write our pairings with their arguments in the order $\langle \text{dual}, \text{primal} \rangle$.

4.6 The benefits of linearization 

There are many reasons why linearization and homogenization are beneficial 
in CAGD. While this monograph is not about linearization, let's pause to 
recall some of those benefits. 

• Linearization simplifies the algebra that underlies geometric operations. 
If P and Q are points in an affine space A, the points on the line
joining them have the form (1 - t)P + tQ. Before we linearize, we must
treat that entire affine combination as a single algebraic operation,
since the individual summands (1 - t)P and tQ are not points. After
linearizing, however, we recognize the summands as anchors. The linear
space $\hat{A}$ of anchors is closed under addition and scalar multiplication as
independent operations. The overall affine combination (1 - t)P + tQ
denotes a point because its weight is 1; we calculate

    w((1 - t)P + tQ) = (1 - t)w(P) + tw(Q) = (1 - t) + t = 1.

• Linearization converts the geometric notions of collinearity, coplanarity, 
and the like into rank tests. Suppose that the affine space A is of
dimension d. Given points $P_0$ through $P_k$ in A, the coordinates of the
corresponding anchors form a (k + 1)-by-(d + 1) matrix. The points
$(P_0, \ldots, P_k)$ are affinely independent, spanning a flat of the maximum
possible dimension k, just when that matrix has rank k + 1.

• Linearization lets us encode an affine transformation as a single matrix.
Before we linearize, we implement an affine transformation of an affine
d-space A as a linear transformation of A followed by a translation,
that is, as a d-by-d matrix together with a vector of length d. After
linearizing, we instead use a single matrix of size (d + 1)-by-(d + 1).
Assuming Cartesian coordinate systems, that larger matrix is produced
by pasting the vector onto the smaller matrix and then adding a new
first (or last) column (or row) that is all zeros, except for a single
one. Combining all of the data that describes an affine transformation
into a single matrix in this way is particularly helpful when we want
to compose affine transformations; after linearizing, we can compose
affine transformations simply by multiplying their matrices.

• Linearization provides an elegant way to generalize from polynomial
curves and surfaces to rational ones. To specify a cubic polynomial
Bezier triangle in the affine object space O, we have been choosing
three cubic forms on the affine parameter plane A to determine the
x, y, and z coordinates of the surface. To allow that surface to be
rational, it suffices to choose one additional cubic form, playing the
role of a common denominator. An elegant way to achieve that effect
is to draw a polynomial surface in the linearized object space $\hat{O}$, which
is a linear 4-space, say with coordinate system $(w_O, x, y, z)$. (We write
the weight coanchor on O as $w_O$ only to distinguish it from the weight
coanchor $w = w_A$ on A.) Projecting the resulting polynomial surface
down into O from the origin of $\hat{O}$ gives us a rational surface in O.
Thus, while a polynomial curve or surface has Bezier points, a rational
Bezier curve or surface has Bezier anchors. Typically, those anchors
are positive scalar multiples of points; but vectors and negative scalar
multiples of points also make sense as Bezier anchors.

• Linearization is the first step on the road to projective geometry. In 
projective geometry, we identify any two anchors that differ by a scalar 
multiple and we treat the resulting equivalence class, a line through 
the origin of the linearized space $\hat{A}$, as a "point" in a new space: the
projective closure of A. The coordinates of any nonzero anchor on
such a line are homogeneous coordinates of the corresponding "point"
in the projective closure. Each point in A, together with all of its scalar
multiples, becomes a "finite point" in this projective closure. But the
projective closure also contains "points at infinity", which are lines
through the origin of $\hat{A}$ that consist entirely of vectors over A. We can
then represent a projective transformation of a d-dimensional space A
using a matrix of size (d + 1)-by-(d + 1): the same matrix that we used
above to encode an affine transformation, except with the constraint on 
the first (or last) column (or row) removed and with the understanding 
that matrices that differ by a scalar multiple are identified. 
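Two of the benefits listed above, the single-matrix encoding of affine transformations and the rank test for affine independence, can be sketched for the plane (d = 2) in plain Python. The sketch assumes that anchors are written as columns of coordinates (w, u, v), so that the extra row of the matrix is the first row (1, 0, 0); the function names and sample transformations are illustrative assumptions.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, p):
    """Apply the matrix m to the anchor p, written as a column (w, u, v)."""
    return [sum(m[i][k] * p[k] for k in range(len(p))) for i in range(len(m))]

def affine_matrix(linear, translation):
    """Paste a 2-by-2 linear part and a translation vector into one 3-by-3 matrix."""
    (a, b), (c, d) = linear
    tu, tv = translation
    return [[1,  0, 0],     # the weight is preserved
            [tu, a, b],
            [tv, c, d]]

def det3(m):
    """Determinant of a 3-by-3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def affinely_independent(p0, p1, p2):
    """Three points of the plane are affinely independent just when the
    3-by-3 matrix of their anchor coordinates has full rank."""
    return det3([p0, p1, p2]) != 0

rotate = affine_matrix(((0, -1), (1, 0)), (0, 0))   # quarter turn about C
shift = affine_matrix(((1, 0), (0, 1)), (5, 7))     # translation by (5, 7)
both = matmul(shift, rotate)                         # rotate first, then shift

P = [1, 2, 3]        # a point: weight 1
phi = [0, 1, 0]      # a vector: weight 0

assert apply(both, P) == apply(shift, apply(rotate, P)) == [1, 2, 9]
assert apply(both, phi) == [0, 0, 1]      # translations leave vectors alone
assert not affinely_independent([1, 0, 0], [1, 1, 1], [1, 2, 2])   # collinear
assert affinely_independent([1, 0, 0], [1, 1, 0], [1, 0, 1])
```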

While linearization is quite valuable, our goal is to take the next step, 
which is algebrization. Linearization embeds an affine space of points in a 
linear space of anchors, thereby defining addition and scalar multiplication 
as separate operations. Algebrization embeds that linear space, in turn, in 
an algebra of sites, thereby defining a new operation of multiplication. Each 
step along this road, from point to anchor to site, from affine space to linear 
space to algebra, brings us new tools to exploit in CAGD. 

It is not clear, by the way, that sites are the end of this road. Section 8.5 
speculates about taking one more step, from sites to locations, so that division 
by points will be legal, as well as multiplication. Thus, the road may go on:
from point to anchor to site to location, and from affine space to linear space
to algebra to field.

Figure 4.2: The homogenized framework

4.7 The algebra of forms 

Figure 4.2 depicts the homogenized framework in the same schematic style 
in which Figure 3.1 depicted the nested-spaces framework. On the left, we 
have the linear 3-space $\hat{A}$ of anchors, with Cartesian basis $(C, \varphi, \psi)$ and with
the domain space A sitting inside it as the affine plane w = 1. On the
right, for each nonnegative n, we have a rounded rectangle representing the
linear space $\mathrm{Poly}_n(\hat{A}, R)$ of n-forms on A. Note that an n-form on A can be
evaluated at any anchor over A, whether or not that anchor is a point, and
hence n-forms on A have all of $\hat{A}$ as their domain. The space $\mathrm{Poly}_0(\hat{A}, R)$ of
constant forms is essentially the real numbers, with (1) as its obvious basis.
The space $\mathrm{Poly}_1(\hat{A}, R)$ of linear forms has (w, u, v) as one possible basis,
so that space is the same as the space $\hat{A}^*$ of coanchors on A. Indeed, for any
linear space X, we have $\mathrm{Poly}_1(X, R) = \mathrm{Lin}(X, R) = X^*$, since 1-forms are
the same thing as linear functionals. Next come quadratic forms on A, cubic
forms, and so forth, where $\dim(\mathrm{Poly}_n(\hat{A}, R)) = \dim(\mathrm{Poly}_{\le n}(A, R)) = \binom{n+2}{2}$.

Homogenization has pulled apart the primal spaces, so that they are no 
longer nested inside of one another; but they do fit together as the slices of a 
graded algebra. In particular, if $f$ is an n-form on A and $g$ is an m-form, the
product $fg$ is a form on A of degree n + m. The big triangle in Figure 4.2
represents this algebra of forms, which we denote $\mathrm{Poly}(\hat{A}, R)$. The algebra
$\mathrm{Poly}(\hat{A}, R)$ is essentially a polynomial algebra; using our Cartesian basis
(w, u, v) for the space of coanchors $\hat{A}^*$, we can think of it as the polynomial
algebra R[w, u, v]. It would be equally valid to use some other basis for the
space $\hat{A}^*$, such as the barycentric basis (q, r, s); but let's stick with our chosen
Cartesian basis until Section 4.9.

Note that the big triangle in Figure 4.2 is bigger than the union of the 
rectangles that it contains; the extra area represents inhomogeneous forms. 
If we add an m-form to an n-form, the sum is again a form on A, but it 
typically is not homogeneous. Such inhomogeneous forms don't seem to be 
good for anything, as far as CAGD is concerned; we shan't use them in this 
monograph. But they don't do much harm either. We view them as valid
forms because we want the set of all forms on A to constitute an algebra and,
in particular, to be closed under addition. This problem of ending up with
more primal objects than we really want did not arise in the nested-spaces 
framework; in that sense, inhomogeneous forms are a cost of homogenization. 
But the benefits of homogenization far outweigh its costs. 

4.8 The dual spaces 

For each nonnegative n, the linear space Poly_n(Â, R) of n-forms on A has
a dual Poly_n(Â, R)*, also shown as a rounded rectangle in Figure 4.2. As
in the nested-spaces framework, the elements of the dual space Poly_n(Â, R)*
are typically called dual functionals. Evaluation at a point is one flavor of 
dual functional, as is evaluation at a vector or evaluation at any anchor. But 
there are also many dual functionals that don't correspond to evaluation. 

The case n = 1 is special, since every dual functional on 1-forms does
correspond to evaluation at a fixed anchor. Indeed, since the space Poly_1(Â, R)
of 1-forms on A is the same as the space Â* of coanchors on A, it follows that
the dual space Poly_1(Â, R)* is simply Â** = Â. This is reflected in Figure 4.2
by having the arrow leaving Poly_1(Â, R) point back to the domain space Â.
We could make Figure 4.2 look more uniform if we moved the space Â over
to the right of the big triangle; but we leave Â on the left, since that is where
we are going to want it in later figures.

This confusion about left versus right arises because of a confusion in the
homogenized framework about the direction in which to break the symmetry
between primal and dual. Once the degree n exceeds 1, people using this
framework typically think of an n-form in Poly_n(Â, R) as a primal object,
suitable to be passed as an input datum to a dual functional in Poly_n(Â, R)*.
In the special case n = 1, however, we surely want to think of a 1-form
(that is, a coanchor) as a dual object, since an anchor in Â, of which a
point in A is a special case, seems quintessentially primal.

4.9 Linearization revisited 

The key to the homogenized framework is the process of linearization, which
takes an affine space A and constructs for us a naturally associated linear
space Â: in some sense, the free linear space generated by A. Let's take
a moment to revisit how linearization works mathematically. There is an
analogous, but less familiar, process of algebrization, which takes a linear
space X and constructs for us a naturally associated commutative algebra
Sym(X): in some sense, the free commutative algebra generated by X.
People in CAGD are already familiar with the algebra of forms, which is
produced by algebrizing the linear space Â* of coanchors. In this monograph,
we also algebrize the linear space Â of anchors, thereby producing the algebra
of sites. We can give ourselves a leg up on understanding algebrization if we
polish our understanding of linearization. 

In particular, we shall consider four approaches to linearization: fixing a 
frame, defining an equivalence relation, exploiting duality, and imposing a 
universal mapping condition. Those four approaches give us four different 
answers to the basic question, "What is an anchor?" 

4.9.1 Fixing a frame 

Let A be an affine space of dimension d that we want to linearize; that is,
we want to construct the naturally associated linear space Â. The simplest
approach involves choosing one particular reference frame for A and letting
our construction of the linear space Â depend upon that choice of frame.

For example, we might choose a Cartesian reference frame for A, say
consisting of the point C in A and the d vectors (φ_1, ..., φ_d) over A. Every
point P in A can be written uniquely as P = C + u_1(P)φ_1 + ... + u_d(P)φ_d,
where the coefficients (u_1(P), ..., u_d(P)) are the Cartesian coordinates of P.
Having fixed that reference frame, we can construct the linearization Â as the
unique linear space that has (C, φ_1, ..., φ_d) as a basis. An anchor over A is
then, by definition, a linear combination p = w(p)C + u_1(p)φ_1 + ... + u_d(p)φ_d
of those d + 1 basis elements.
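
As a concrete illustration of the fixed-frame construction, here is a minimal sketch in which an anchor is stored simply as its coordinate tuple (w, u_1, ..., u_d). The helper names `point`, `vector`, `combine`, and `weight` are ours, chosen for illustration only.

```python
from fractions import Fraction

def point(*u):
    """A point of A: an anchor whose w coordinate is 1."""
    return (Fraction(1),) + tuple(Fraction(x) for x in u)

def vector(*u):
    """A vector over A: an anchor whose w coordinate is 0."""
    return (Fraction(0),) + tuple(Fraction(x) for x in u)

def combine(terms):
    """Linear combination of anchors; terms is a list of (coefficient, anchor)."""
    size = len(terms[0][1])
    return tuple(sum(Fraction(c) * a[k] for c, a in terms) for k in range(size))

def weight(anchor):
    """The w coordinate of an anchor."""
    return anchor[0]

P, Q = point(1, 2), point(3, 6)
midpoint = combine([(Fraction(1, 2), P), (Fraction(1, 2), Q)])  # weight 1: a point
difference = combine([(1, Q), (-1, P)])                         # weight 0: a vector
total = combine([(1, P), (1, Q)])                               # weight 2: neither
```

The last line is the interesting one: the sum P + Q is a perfectly good anchor, even though it is neither a point nor a vector.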

Of course, a barycentric reference frame would work just as well. Suppose
that the points (R_0, ..., R_d) are the vertices of a nondegenerate d-simplex in
A, so that every point P in A can be written uniquely as an affine combination
of the (R_i). Having fixed that frame, we could define the linearization Â to
be the unique linear space that has (R_0, ..., R_d) as a basis. So an anchor,
under this definition, is any linear combination p = r_0(p)R_0 + ... + r_d(p)R_d
of those d + 1 points.
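
To see that the two kinds of frame describe the same anchors, we can convert barycentric coordinates into Cartesian ones. The sketch below assumes d = 2 and a hypothetical reference triangle whose vertex coordinates are made up for the example; the function name is ours.

```python
from fractions import Fraction

# Hypothetical reference triangle: Cartesian coordinates (u, v) of the
# vertices R_0, R_1, R_2.  Any nondegenerate triangle would do.
VERTICES = [(0, 0), (4, 0), (0, 4)]

def barycentric_to_cartesian(r):
    """Map barycentric coordinates (r_0, r_1, r_2) of an anchor to (w, u, v)."""
    r = [Fraction(x) for x in r]
    w = sum(r)  # the weight is the sum of the barycentric coordinates
    u = sum(ri * x for ri, (x, _) in zip(r, VERTICES))
    v = sum(ri * y for ri, (_, y) in zip(r, VERTICES))
    return (w, u, v)

third = Fraction(1, 3)
centroid = barycentric_to_cartesian((third, third, third))  # a point
edge = barycentric_to_cartesian((-1, 1, 0))                 # the vector R_1 - R_0
```

Barycentric coordinates that sum to 1 give a point, and those that sum to 0 give a vector, in agreement with the Cartesian description.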

This process of fixing a frame is the simplest approach to linearization; 
it is both easy to understand and easy to prove theorems about. But it is 
unsettling to have the concept of an anchor over A appear to depend upon 
which reference frame for A we happen to have chosen. 

Vacant remark: An affine space of dimension 0 is a single point. We could
stop there, but let's go one more step, calling the empty set the unique affine
space of dimension −1. There are a few anomalies in the case d = −1,
and we'll comment about them in paragraphs, like this one, labeled "Vacant
remark"; feel free to skip them.

One anomaly in the case d = −1 is that the empty affine space doesn't
have any Cartesian reference frames; such a frame would have to consist
of one point (of which there aren't any) and minus one vectors. But the
empty affine space does have a barycentric reference frame, in fact, a unique
one: the sequence with 0 points. The linearization of the empty affine space
is the zero linear space, the space whose only element is 0. And the rule
dim(Â) = dim(A) + 1 holds also when A is empty.

4.9.2 Defining an equivalence relation 

Mathematicians have a standard technique for avoiding choices such as the 
choice of a reference frame: They make all choices simultaneously and then 
use an equivalence relation to collapse out the superfluous structure that 
results. Here's how we would linearize using that technique. 

Given the affine space A, we first construct the unique linear space L(A)
that has A itself as a basis. The space L(A) is huge; indeed, it has uncountable
dimension, one dimension for each point P in A. All of those extra
dimensions allow us to draw too many distinctions. For example, suppose
that M := (P + Q)/2 is the midpoint of the segment from P to Q in A. The
expression P/2 + Q/2 − M denotes a certain element of L(A): the unique
element with coordinate 1/2 on the P axis, 1/2 on the Q axis, −1 on the
M axis, and 0 on all other axes. We don't want any such element in the
linearization Â; more precisely, we want the expression P/2 + Q/2 − M to
denote 0 in Â, since that's what it means to say that M = (P + Q)/2. More
generally, for every way of expressing a point Q in A as an affine combination
of other points, say Q = b_1 P_1 + ... + b_m P_m with b_1 + ... + b_m = 1, we want to
have b_1 P_1 + ... + b_m P_m − Q = 0 in the linearization Â. To achieve that, let
E(A) denote the smallest linear subspace of L(A) that contains all elements
of the form b_1 P_1 + ... + b_m P_m − Q, where Q = b_1 P_1 + ... + b_m P_m in A. We
then define the linearization Â to be the quotient space L(A)/E(A).

So, what is an anchor in this approach? An anchor over A is a huge
equivalence class of linear combinations of points in A, under a certain
equivalence relation. And how do we test whether two linear combinations, say
b_1 P_1 + ... + b_n P_n and c_1 Q_1 + ... + c_m Q_m, are equivalent? We first choose
some reference frame for the affine space A. Having chosen such a frame,
we expand each point P_i and each point Q_j as a linear combination of our
frame elements. The two linear combinations of points that we started with
are equivalent just when the two linear combinations of frame elements that
result from this rewriting are equal. Note that we have to choose a frame in
order to carry out this test, but the result of the test doesn't depend upon
which frame we choose.
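
The equivalence test just described is easy to mechanize. Here is a minimal sketch, assuming a Cartesian frame in which each point is given by its coordinates (u_1, ..., u_d); the names `expand` and `equivalent` are ours.

```python
from fractions import Fraction

def expand(combo):
    """Rewrite a formal linear combination of points in the Cartesian frame.

    combo is a list of (coefficient, point) pairs, each point being a tuple
    of Cartesian coordinates.  The result is the coordinate tuple
    (w, u_1, ..., u_d) of the anchor that the combination denotes.
    """
    d = len(combo[0][1])
    w = sum(Fraction(c) for c, _ in combo)
    u = tuple(sum(Fraction(c) * Fraction(p[k]) for c, p in combo)
              for k in range(d))
    return (w,) + u

def equivalent(combo1, combo2):
    """Two combinations are equivalent when their expansions coincide."""
    return expand(combo1) == expand(combo2)

P, Q, M = (0, 0), (2, 4), (1, 2)  # M is the midpoint of P and Q
half = Fraction(1, 2)
same_anchor = equivalent([(half, P), (half, Q)], [(1, M)])
same_doubled = equivalent([(1, P), (1, Q)], [(2, M)])
different = equivalent([(1, P), (1, Q)], [(1, M)])
```

The first two tests succeed, while the third fails: P + Q has weight 2, so it cannot be the point M.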

The advantage of this approach, over the fixed-frame approach, is that it 
gives us a notion of "anchor" that is independent of reference frame. But we 
pay a high price in mathematical complexity for that frame-independence: 
taking a quotient of linear spaces of uncountable dimension. 

4.9.3 Exploiting duality 

Defining an equivalence relation, as above, is the standard way that a math- 
ematician would achieve frame-independence; but there are other ways. A 
more specialized trick that is available in this case exploits duality. 

Given the affine space A, consider the space Aff(A, R) of all affine,
real-valued maps on A. Since the space of real numbers R is linear, as well as
affine, the space of maps Aff(A, R) is also linear, with addition and scalar
multiplication defined pointwise. Hence, it makes sense to talk about the
dual space Aff(A, R)*. And that dual space turns out to be a perfectly
fine model for the linearization Â. Note that dim(Aff(A, R)) = d + 1, the
extra +1 coming, in Cartesian coordinates, from the constant term. So
dim(Aff(A, R)*) = d + 1 also, which is what we want for the linearization Â.
We also want the affine space A to sit, in its linearization Â, as a hyperplane
not containing the origin. If we make the definition Â := Aff(A, R)*, then
A won't actually be a subspace of its linearization Â. But there will be a
natural isomorphism from A to an affine hyperplane in Â: the map that
takes a point P in A to the evaluate-at-P functional e_P, the second-order
functional defined, for all f in Aff(A, R), by e_P(f) := f(P).
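
The duality construction can be imitated with closures. In the sketch below, which assumes d = 2, an affine map is a Python function of a point (u, v), and an anchor is a function that consumes such maps. All names are ours and serve only as illustration.

```python
from fractions import Fraction

def affine_map(a0, a1, a2):
    """The affine map f(u, v) = a0 + a1*u + a2*v on the plane A."""
    return lambda P: Fraction(a0) + Fraction(a1) * P[0] + Fraction(a2) * P[1]

def evaluate_at(P):
    """The functional e_P, which sends an affine map f to the number f(P)."""
    return lambda f: f(P)

def combine(terms):
    """Linear combination of functionals; terms is a list of (coefficient, functional)."""
    return lambda f: sum(Fraction(c) * e(f) for c, e in terms)

P, Q, M = (0, 0), (2, 4), (1, 2)  # M is the midpoint of P and Q
e_P, e_Q, e_M = evaluate_at(P), evaluate_at(Q), evaluate_at(M)
half = Fraction(1, 2)
mid_functional = combine([(half, e_P), (half, e_Q)])
diff_functional = combine([(1, e_Q), (-1, e_P)])

f = affine_map(5, 3, -1)
one = affine_map(1, 0, 0)  # the constant map 1, which measures weight
```

On any affine map, `mid_functional` agrees with `e_M`, which is the sense in which e_P/2 + e_Q/2 = e_M. Applying a functional to the constant map `one` recovers its weight: 1 for `e_P`, and 0 for the vector-like `diff_functional`.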

The reason that this trick works is that there is a natural one-to-one
correspondence between Aff(A, R) and Lin(Â, R). In the forward direction,
that's just a special case of homogenization. Recall from Section 4.3 that
a polynomial map f: A → R of degree at most n extends uniquely to an
n-form f̂: Â → R. Letting n = 1, we deduce that any f in Aff(A, R)
extends uniquely to a 1-form f̂ in Lin(Â, R). The reverse direction is even
easier: We produce f from f̂ by restricting the domain from Â to A. Since
Aff(A, R) ≅ Lin(Â, R) = Â*, it then follows that Aff(A, R)* ≅ Â** = Â.

In this approach, an anchor over A is a linear functional on Aff(A, R),
which, given the one-to-one correspondence that we just discussed, is
essentially the same thing as a linear functional on Lin(Â, R) = Â*. So an anchor
is essentially a linear functional on coanchors! That is pleasantly symmetric,
since a coanchor is, of course, precisely a linear functional on anchors. It
is the qualifier "essentially" that allows the resulting pair of definitions to
avoid circularity. Indeed, the real work of linearization is in showing that the
spaces Aff(A, R) and Lin(Â, R) are in natural one-to-one correspondence.

Math remark: Exploiting duality in this way requires that A be
finite-dimensional, since we need the isomorphism Â** ≅ Â. In contrast, the first
two approaches work fine to linearize affine spaces even of infinite dimension.

4.9.4 Imposing a universal mapping condition 

We've now seen three concrete constructions of anchors, one dependent on 
a choice of reference frame and the other two frame-independent. There 
are further possibilities. For example, Berger [3] gives a frame-independent
construction with a geometric flavor, in which an anchor over A turns out to 
be a vector field on A of a certain type. 

Why don't multiple concrete constructions lead to chaos? Because we 
can characterize the linearization Â abstractly, using a universal mapping
condition. That condition does not determine the linearization uniquely, but 
does determine it up to a unique isomorphism. So any concrete construction 
must produce a result that is uniquely isomorphic to every other such result. 
We need to verify that one of the concrete constructions succeeds, in order to 
show that the universal mapping condition is satisfiable. But, once we have 
done that, it doesn't matter which concrete construction we employ, since 
they all give essentially the same result. 

We shall return to these issues in greater depth in Chapter 9. But here,
in brief, is how to characterize the linearization abstractly. A linearization of
an affine space A is a pair (X, i) consisting of a linear space X and an affine
map i: A → X that satisfies the following universal mapping condition:

For every pair (Y, j) consisting of a linear space Y and an affine
map j: A → Y, there exists a unique linear map f: X → Y with
j = f ∘ i.

As it turns out, this universal mapping condition can be satisfied. Choose,
for the space X, some linear space with dim(X) = dim(A) + 1 and choose, for
the map i: A → X, some affine injection whose image, which will be an affine
hyperplane in X, does not include the origin. Given any pair (Y, j), the values
of f on the image of i are determined by the equation j = f ∘ i, and those
values extend uniquely to a linear map f: X → Y. Furthermore, we shall
see in Section 9.1 that, whenever some two pairs (X_1, i_1) and (X_2, i_2) both
satisfy the universal mapping condition, there is a unique linear isomorphism
between X_1 and X_2 that makes this diagram commute:



              A
        i_1 /   \ i_2
           /     \
         X_1 <--> X_2



Since any pair that satisfies the universal mapping condition is uniquely
isomorphic to any other, we are justified in choosing any satisfying pair (X, i)
that we like and referring to the space X in that pair as "the" linearization
of A, denoting it Â. The affine injection i: A → Â allows us to identify A
with the image of i, which is an affine hyperplane in Â not containing the
origin. Thus, it is also safe for us to pretend that the linearization Â of A
actually includes A as a subset: the set of unit-weight anchors.
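
The existence half of the universal mapping condition can be made concrete. The sketch below assumes that A is a plane, that X is coordinate 3-space, and that the injection i sends the point (u, v) to (1, u, v); the frame elements C = (0, 0), C + φ = (1, 0), and C + ψ = (0, 1) are hypothetical choices, and the sample map j is made up for the example.

```python
from fractions import Fraction

def include(P):
    """The affine injection i: A -> X, sending the point (u, v) to (1, u, v)."""
    return (Fraction(1), Fraction(P[0]), Fraction(P[1]))

def linear_extension(j):
    """Given an affine map j: A -> R^m, build the linear f: X -> R^m with j = f o i.

    The map f is fixed by its values on the basis i(C), i(C + phi) - i(C),
    and i(C + psi) - i(C) of X.
    """
    jC, jU, jV = j((0, 0)), j((1, 0)), j((0, 1))
    def f(x):
        w, u, v = x
        return tuple(w * c + u * (a - c) + v * (b - c)
                     for c, a, b in zip(jC, jU, jV))
    return f

def j(P):
    """A sample affine map from the plane A into Y = R^2."""
    u, v = P
    return (Fraction(2) + 3 * u - v, Fraction(-1) + u + 4 * v)

f = linear_extension(j)
agrees = all(f(include((u, v))) == j((u, v))
             for u in range(-2, 3) for v in range(-2, 3))
```

The flag `agrees` confirms that j = f ∘ i on a grid of sample points. Uniqueness holds because the image of i spans X, so there was no freedom in choosing f.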

4.9.5 So what is an anchor, really? 

With this universal mapping condition in mind, we can now give the truest 
and deepest answer to the question, "What is an anchor?" Answer: An 
anchor over A is an element of some concrete linearization of A, but with the 
understanding that, if two different linearizations of A ever get involved in 
the same argument, we are required to use the unique isomorphism between 
them to identify each element of one with the corresponding element of the 
other. That is, we agree not to distinguish between different linearizations. 
So all of our earlier answers were correct simultaneously. An anchor over
A is a linear combination of (C, φ_1, ..., φ_d). It's also a linear combination
of (R_0, ..., R_d). It's also a huge equivalence class of linear combinations of
points of A. And it's a linear functional on coanchors, and it's a vector field
of a certain type, and so on. Speaking loosely, an anchor over A is an element
of "the" linearization Â of A.

Keep in mind that these same issues are going to arise again in defining 
sites. A site over A is, speaking loosely, an element of "the" algebrization
Sym(Â) of the linear space Â of anchors. Given any linear space X, there is
a universal mapping condition that determines when a commutative algebra 
is an algebrization of X. Since it follows from this condition that any two 
algebrizations of X are isomorphic in a unique way, we typically pretend that 
the algebrization Sym(X) is uniquely determined. 

In fact, the same issues arose already in defining forms, although we
didn't comment about them at the time. The algebra of forms is, we claim,
the algebrization Sym(Â*) of the linear space Â* of coanchors. That claim
should be plausible, because a form is, roughly speaking, a polynomial whose
variables are coanchors. We introduced the algebra of forms in Section 4.7
as the algebra Poly(Â, R) of all polynomial, real-valued functions on Â. But
that is simply one concrete construction of the abstract algebra Sym(Â*).
Indeed, for any linear space X, it turns out that we can exploit duality
to construct the algebrization Sym(X) concretely as Poly(X*, R). So one
concrete model for the algebra of forms Sym(Â*) is the algebra of functions
Poly(Â**, R) = Poly(Â, R).

Figure 4.3: The homogenized framework with abstract labels

Figure 4.3 shows the homogenized framework again, just as in Figure 4.2,
except that the spaces are now labeled abstractly. For example, the space
of n-forms on A, which used to be labeled Poly_n(Â, R), is now labeled
Sym_n(Â*). (Many authors write Sym^n(X) for the nth graded slice of the
algebra Sym(X), with the n as a superscript. We make the n a subscript
just for consistency with the notations Poly_n(Â, R) and R_n[w, u, v].)



Chapter 5 

The Separate-Algebras Framework

The homogenized framework has brought us closer to symmetry, in the sense
that, in Figure 4.3, neither the primal spaces (Sym_n(Â*)), for n ≥ 0, nor the dual
spaces (Sym_n(Â*)*), for n ≥ 0, are nested. But we still haven't achieved symmetry.
The primal spaces fit together to make up the algebra of forms, while each
dual space stands alone. Our eventual goal is the paired-algebras framework,
in which the dual spaces fit together, in a similar way, to make up the algebra
of sites. But it is going to take us two steps to get there.

In the first of those two steps, we achieve symmetry in a brute-force
way by treating the linear space Â of anchors exactly as the homogenized
framework treats the space Â* of coanchors. The result is the
separate-algebras framework, shown in Figure 5.1. This framework has the serious
drawback that there are four linear spaces associated with each degree n, the
space of n-forms Sym_n(Â*) and its dual Sym_n(Â*)* being joined by the space
of n-sites Sym_n(Â) and its dual Sym_n(Â)*.

In the second step, we shall choose a sequence of pairing maps, the nth of
which pairs the space of n-forms with the space of n-sites, thereby allowing us
to use each of those spaces to represent the dual of the other. This yields the
paired-algebras framework, with just two spaces on the nth level once again,
rather than four. The reason that we delay taking this second step until
Chapter 7 is that it entails a contentious choice about an annoying factor of
n!. There are two sequences of pairing maps, in which the nth maps differ by
a factor of n!. Consider an n-form and an n-site, both of which happen to be
perfect powers, say the n-form h^n and the n-site p^n, where h is a coanchor
and p is an anchor. With one pairing, we have ⟨h^n, p^n⟩ = ⟨h, p⟩^n; with the
other, we have ⟨h^n, p^n⟩ = n! ⟨h, p⟩^n. Sad to say, adopting either convention
leaves us with annoying factors in many of our formulas, as we discuss in
Appendix B. For now, let's get as far as we can using the separate-algebras
framework, before tackling the annoying n!.
Figure 5.1: The separate-algebras framework (left triangle: Sites; right triangle: Forms)

5.1 The algebra of sites

The triangle on the right in Figure 5.1 is the algebra of forms Sym(Â*) =
Poly(Â, R), just as in the homogenized framework. In a completely symmetric
way, the triangle on the left is the algebra Sym(Â) = Poly(Â*, R), which
we christen the algebra of sites.

This may be as good a time as any to discuss why I chose the word
"site". I wanted a noun that would accept a numeric prefix, so that I could
talk about n-sites as being dual to n-forms; that strongly suggested a
one-syllable noun. I also wanted a noun that means something like "point". The
nouns "place" and "site" met those criteria. Unfortunately, both of those
words have preexisting meanings in algebraic geometry. A place on a curve 
is an equivalence class of irreducible parameterizations — roughly speaking, 
a point on a branch of the curve [1, 48]. From the Encyclopedic Dictionary 
of Mathematics [35], I learned that a site is a category in which each object 
comes equipped with a covering family of morphisms that fit together to 
form a Grothendieck topology. I hope that Grothendieck topologies are high- 
powered enough that no confusion will arise between that meaning of "site" 
and sites as the duals of forms. 

The equality Sym(Â) = Poly(Â*, R) suggests that a site is a real-valued,
polynomial function on coanchors; and indeed, that is one of various
equivalent ways to define a site. But viewing sites from that perspective is not the
best way to get to know them. Keep in mind that, roughly speaking, sites
are polynomials whose variables are anchors, just as forms are polynomials
whose variables are coanchors. Let's refer to such polynomials as anchor
polynomials and coanchor polynomials.

For definiteness, let's assume once again that A is an affine plane, of
dimension d = 2. What is a site over A, more precisely? For that matter,
what is a form on A? Both questions have an abstract answer, given on the
first line of Table 5.1, and a variety of concrete answers, three of which are 
given on the following lines. Table 5.2 shows the four different ways in which 
we shall denote the linear space of all n-sites over A and the linear space of 
all n-forms on A, corresponding to the four lines in Table 5.1. So each of the 
bottom three lines names a concrete construction for the linear space that 
the top line names abstractly. 

5.1.1 Imposing a universal mapping condition 

The linearization Â of an affine space A is a linear space that satisfies a certain
universal mapping condition. In a similar way, the algebrization Sym(X) of
a linear space X is a commutative algebra that satisfies a universal mapping
condition. We shall discuss that condition and related issues in Chapter 9.
Until then, just keep in mind that there is an abstract characterization of
the algebra of sites that determines it up to a unique isomorphism. So which
concrete construction we adopt for that algebra doesn't matter.

Row 1.  site: an element of the algebrization Sym(Â) of the linear space Â
        form: an element of the algebrization Sym(Â*) of the linear space Â*

Row 2.  site: a polynomial in the anchor variables (C, φ, ψ)
        form: a polynomial in the coanchor variables (w, u, v)

Row 3.  site: an equivalence class of anchor polynomials whose variables
              are arbitrary anchors
        form: an equivalence class of coanchor polynomials whose variables
              are arbitrary coanchors

Row 4.  site: a real-valued function on coanchors that can be defined by
              some anchor polynomial; in fact, by an equivalence class of
              anchor polynomials
        form: a real-valued function on anchors that can be defined by
              some coanchor polynomial; in fact, by an equivalence class of
              coanchor polynomials

Table 5.1: What are sites and forms?

5.1.2 Fixing a basis 

Given a linear space X of dimension k, the simplest way to construct the
symmetric algebra Sym(X) is to fix a basis (ξ_1, ..., ξ_k) for X and then to
construct Sym(X) as the algebra R[ξ_1, ..., ξ_k] of all polynomials in those k
basis elements, treated as variables. Using this approach, we can construct
the algebra of forms Sym(Â*) as R[w, u, v], and we can construct the algebra
of sites Sym(Â) as R[C, φ, ψ]. While this approach is delightfully simple, it
might seem to unfairly favor the fixed basis.

5.1.3 Defining an equivalence relation 

We would prefer to use different bases at different times and, even better, to
use multiple bases simultaneously. For example, we would like any coanchor
polynomial to denote a form, even if its variables don't all come from any
single basis for the space Â* of coanchors. Once we allow coanchors that
are linearly dependent, however, distinct polynomials may denote the same
form. For example, the linear dependence q + r + s = w tells us that the two
polynomials q + r + s and w denote the same 1-form: to wit, the weight
coanchor. It follows that the quadratic polynomials qu + ru + su = (q + r + s)u
and wu must denote the same 2-form. Thus, we can think of a form as an
equivalence class of coanchor polynomials.

          space of n-sites      space of n-forms
Row 1.    Sym_n(Â)              Sym_n(Â*)
Row 2.    R_n[C, φ, ψ]          R_n[w, u, v]
Row 3.    R_n[Â]/≈              R_n[Â*]/≈
Row 4.    Poly_n(Â*, R)         Poly_n(Â, R)

Table 5.2: Formulas for the spaces of n-sites and n-forms

Given two coanchor polynomials, elements of the huge algebra R[Â*],
how do we test whether they are equivalent? Answer: We rewrite all of the
coanchors in both of them as linear combinations of w, u, and v and check
whether the rewritten polynomials coincide. Of course, there is nothing
special about the basis (w, u, v); we can adopt any basis for Â* in performing
this equivalence test without affecting the result.

The same goes for sites. We would like any anchor polynomial in R[Â] to
denote a site, even if its anchors don't all come from any single basis for Â.
But once we allow anchors that are linearly dependent, distinct polynomials
may denote the same site. For example, let E := (Q + R + S)/3 be the
centroid of the reference triangle QRS. The linear polynomials 3E and
Q + R + S denote the same 1-site: an anchor of weight 3. It follows that
the quadratic polynomials 3Eψ and Qψ + Rψ + Sψ = (Q + R + S)ψ denote
the same 2-site. We can think of a site as an equivalence class of anchor
polynomials, where two such polynomials are equivalent when rewriting all
of the anchors in both of them as linear combinations of the anchors in a
common basis for Â would make them coincide; and which common basis we
adopt in this test of equivalence doesn't matter.
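
The rewriting test for anchor polynomials can be sketched in a few lines. We store a polynomial in the basis (C, φ, ψ) as a dictionary from exponent triples to coefficients, and we give each anchor by its coordinates (w, u, v). The reference triangle below is hypothetical, chosen so that its centroid has simple coordinates, and all function names are ours.

```python
from fractions import Fraction
from collections import defaultdict

def linear(coords):
    """The degree-1 polynomial w*C + u*phi + v*psi, as {exponents: coefficient}."""
    poly = {}
    for k, c in enumerate(coords):
        if c != 0:
            exps = tuple(1 if i == k else 0 for i in range(len(coords)))
            poly[exps] = Fraction(c)
    return poly

def multiply(p, q):
    """Product of two polynomials in the basis variables."""
    out = defaultdict(Fraction)
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[tuple(a + b for a, b in zip(e1, e2))] += c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def expand(terms):
    """Rewrite a polynomial in arbitrary anchors over the common basis.

    terms is a list of (coefficient, [anchor, ...]) pairs, each denoting
    the coefficient times the product of the listed anchors.
    """
    total = defaultdict(Fraction)
    for coeff, factors in terms:
        product = {(0, 0, 0): Fraction(coeff)}
        for anchor in factors:
            product = multiply(product, linear(anchor))
        for e, c in product.items():
            total[e] += c
    return {e: c for e, c in total.items() if c != 0}

def equivalent(terms1, terms2):
    """Two anchor polynomials are equivalent when their expansions coincide."""
    return expand(terms1) == expand(terms2)

# Hypothetical reference triangle and frame vector, in (w, u, v) coordinates.
Q, R, S = (1, 0, 0), (1, 3, 0), (1, 0, 3)
E = (1, 1, 1)    # the centroid (Q + R + S)/3
psi = (0, 0, 1)

lhs = [(3, [E, psi])]                              # 3 E psi
rhs = [(1, [Q, psi]), (1, [R, psi]), (1, [S, psi])]  # Q psi + R psi + S psi
```

Here `equivalent(lhs, rhs)` holds, since both sides expand to 3Cψ + 3φψ + 3ψ². The same code tests coanchor polynomials if we read the coordinate triples as coefficients of (w, u, v) instead.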

5.1.4 Exploiting duality 

One thing that you can do with a coanchor polynomial is to define a
real-valued function on anchors; for example, the coanchor polynomial wu
defines the function that takes an anchor p to the real number w(p)u(p) =
⟨w, p⟩⟨u, p⟩. As it happens, we are quite interested in real-valued functions
on anchors, since we intend to use three of them to define the x, y, and z
coordinates of our Bezier triangle. Note that the coanchor polynomial qu + ru + su
defines the same real-valued function as does wu; so each real-valued function
actually arises from some equivalence class of coanchor polynomials. In fact,
these equivalence classes are the same ones that we introduced in the third
row of Table 5.1. Thus, we can think of a form either as an equivalence class
of coanchor polynomials or as the common real-valued function on anchors
that any one of those equivalent polynomials defines.

The same goes for sites, except that CAGD doesn't give us any particular
reason to be interested in the resulting real-valued functions. An anchor
polynomial defines a real-valued function on coanchors; for example, the
anchor polynomial 3Eψ defines the function that takes a coanchor h to the real
number 3E(h)ψ(h) = 3h(E)h(ψ) = 3⟨h, E⟩⟨h, ψ⟩. Two anchor polynomials
define the same real-valued function just when they are equivalent, in the
sense of the third row of Table 5.1. For example, the equivalent polynomials
3Eψ and Qψ + Rψ + Sψ define the same real-valued function. Thus, we can
think of a site either as an equivalence class of anchor polynomials or as the
common real-valued function on coanchors that any one of those equivalent
polynomials defines.
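
Evaluating an anchor polynomial at a coanchor is just a matter of replacing each anchor p by the number ⟨h, p⟩. The sketch below assumes that anchors and coanchors are given by coordinates with respect to the dual bases (C, φ, ψ) and (w, u, v); the triangle and the coanchor h are made up for the example, and the function names are ours.

```python
from fractions import Fraction

def pair(h, p):
    """The pairing <h, p> of a coanchor h with an anchor p, in dual coordinates."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(h, p))

def evaluate(terms, h):
    """Value at the coanchor h of an anchor polynomial.

    terms is a list of (coefficient, [anchor, ...]) pairs, each denoting
    the coefficient times the product of the listed anchors.
    """
    total = Fraction(0)
    for coeff, factors in terms:
        value = Fraction(coeff)
        for p in factors:
            value *= pair(h, p)
        total += value
    return total

# Hypothetical reference triangle and frame vector, in (w, u, v) coordinates.
Q, R, S = (1, 0, 0), (1, 3, 0), (1, 0, 3)
E = (1, 1, 1)    # the centroid (Q + R + S)/3
psi = (0, 0, 1)

h = (2, -1, 5)   # the coanchor 2w - u + 5v, an arbitrary choice
value_lhs = evaluate([(3, [E, psi])], h)
value_rhs = evaluate([(1, [Q, psi]), (1, [R, psi]), (1, [S, psi])], h)
```

Both values come out to 90, as they must, since the two polynomials are equivalent.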

Mathematically, forms and sites are completely symmetric; but their
applications to CAGD are not. Since CAGD gives us good uses for real-valued
functions on anchors, the definition of forms in the fourth row seems more
attractive than the one in the third row. Indeed, when we first defined forms
in Chapter 4, we talked only about real-valued functions on anchors, leaving 
implicit the equivalence relation on coanchor polynomials. But CAGD does 
not give us similarly good uses for real-valued functions on coanchors. Hence, 
in defining sites, the third row seems more attractive than the fourth. 

Math remark: Why does it work to algebrize by exploiting duality, that is, to
construct Sym(X) as Poly(X*, R)? It works because, over the real numbers,
the coefficients of a polynomial are uniquely determined by that polynomial's
values. The same works over any infinite field, even infinite fields of prime
characteristic. But not over finite fields. Let p be a prime. Over the Galois
field of order p^k, the two polynomials ξ^(p^k) and ξ have all of the same values;
thus, two distinct elements of Sym(X) would be indistinguishable as elements
of Poly(X*, R). For more on this, see Section C.2.3.
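
The failure over finite fields is easy to observe in the simplest case, k = 1, where the field of order p is just the integers modulo p. The sketch below is ours and covers only that prime-field case; it tabulates the values of two monomials at every field element.

```python
def values_over_prime_field(exponent, p):
    """Values of the polynomial x**exponent at every element of the integers mod p."""
    return [pow(x, exponent, p) for x in range(p)]

# For each prime p, compare the value tables of x**p and x.
indistinguishable = {p: values_over_prime_field(p, p) == values_over_prime_field(1, p)
                     for p in (2, 3, 5, 7, 11)}
```

Every entry of `indistinguishable` is true, even though x^p and x are distinct polynomials. Over the real numbers, by contrast, they already disagree at x = 2.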

5.2 The weight of a site 

While we are not typically interested in sites as real-valued functions on
coanchors (they are going to be useful to us for other reasons), there
is one coanchor at which we do want to evaluate our sites: w, the weight
coanchor. If s is any site over A, we define the real number s(w) to be the
weight of s. If s is a 1-site over A, that is, if s = p is an anchor, we have
s(w) = p(w) = w(p) = ⟨w, p⟩, so this definition agrees with our former notion
for the weight of an anchor. More generally, suppose that we are given an
n-site s explicitly, in terms of our Cartesian basis, as a linear combination of
the n-sites (C^(n−i−j) φ^i ψ^j) for i + j ≤ n. Since C is a point, of weight 1, while φ and
ψ are vectors, of weight 0, the weight of s is simply the coefficient of C^n in
this expansion.
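
Both descriptions of the weight can be compared directly. In the sketch below, an n-site is stored as a dictionary whose key (i, j) stands for the basis n-site C^(n−i−j) φ^i ψ^j; that representation and the function names are ours.

```python
from fractions import Fraction

def weight(site):
    """Weight of an n-site: the coefficient of C**n, stored under the key (0, 0)."""
    return Fraction(site.get((0, 0), 0))

def evaluate_at_w(site, n):
    """Evaluate the n-site at the weight coanchor w, using
    <w, C> = 1 and <w, phi> = <w, psi> = 0."""
    C, phi, psi = 1, 0, 0
    return sum(Fraction(c) * C ** (n - i - j) * phi ** i * psi ** j
               for (i, j), c in site.items())

# The 2-site s = 2 C^2 - 3 C phi + 5 psi^2, a made-up example.
s = {(0, 0): 2, (1, 0): -3, (0, 2): 5}
```

Here `weight(s)` and `evaluate_at_w(s, 2)` both give 2, since every basis site other than C² contains a vector factor and hence vanishes at w.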

Exercise 5.2-1 If we think of a real number t as a 0-site over A, what is its
weight t(w)? (Answer: Working from the fourth row of Table 5.1, a 0-site
is a real-valued function on coanchors that is homogeneous of degree 0, that
is, a constant function. So t(h) = t for all coanchors h, including t(w) = t.
Working from the second row gives the same result: The coefficient of C^0 in
the expansion t = tC^0 is t.)

Now that we have defined a notion of weight for n-sites, let's introduce
the notation Sym_n(Â)↓ for the affine space of all n-sites over A that are of
unit weight. Thus, if we construct our sites by fixing the Cartesian reference
frame (C, φ, ψ), then the affine subspace Sym_n(Â)↓ = R_n[C, φ, ψ]↓ consists of
all polynomials that are homogeneous of degree n in C, φ, and ψ and in which
the coefficient of the term C^n is 1. The intuition behind the downward arrow
is that restricting to unit weight is like undoing linearization. If linearization
lifts us from the affine space A to the linear space Â with its associated
weight functional, then restricting to unit weight takes us back again: Â↓ = A.
Note that it makes sense to restrict to unit weight only when the
context makes clear which weight functional is intended; it wouldn't make
sense to write Sym_n(Â*)↓, for example, since we haven't defined a notion of
weight for n-forms.

5.3 The dual spaces 

Each graded slice of the algebra of forms is a linear space, which has a dual,
and the same is true for the algebra of sites. Thus, Figure 5.1 has four linear
spaces on the nth level, for n ≠ 1. But the case n = 1 is special. The linear
space Sym_1(Â) = Poly_1(Â*, R) of 1-sites is simply the space Â = Â** of
anchors, just as the linear space Sym_1(Â*) = Poly_1(Â, R) of 1-forms is the
space Â* of coanchors. The two spaces Â and Â* formed a dual pair already
in the homogenized framework, and we have no reason to break up that
pairing now. Thus, there are only two spaces on the first level in Figure 5.1,
and the double-headed arrow on that level simply links those spaces to each
other. When we are ready to tackle the annoying n! in Chapter 7, we'll be
able to cut back to only two spaces on every level.

While the separate-algebras framework of Figure 5.1 has twice as many
spaces as it should, at least it comes close to achieving perfect symmetry
between forms and sites. Indeed, had we started with an arbitrary dual pair of
linear spaces $(X, Y)$ in building the algebras of forms and sites, $\mathrm{Sym}(X)$ and
$\mathrm{Sym}(Y)$, the symmetry would have been perfect. But the spaces in the pair
$(\hat A^*, \hat A)$ are not arbitrary. In the space $\hat A$ of anchors, we have distinguished
the hyperplane $A$ of points as being of special interest. There is no analogous
hyperplane in the space $\hat A^*$ of coanchors; instead, it is a particular coanchor,
the weight coanchor $w$, that is distinguished by the equation $A = w^{-1}(1)$.
Those two distinguished structures, each of which determines the other, are
the sole source of mathematical asymmetry in the separate-algebras
framework. But keep in mind that there is also some motivational asymmetry; for
example, we have practical applications in mind for real-valued functions on
anchors, but none for real-valued functions on coanchors.

5.4 Flavors of evaluation 

Before we leave Figure 5.1, let's review the three flavors of evaluation that 
we have defined. Consider a site-like object S and a form-like object T . In 
what situations can we evaluate one of them at the other? 

    S         T
    ------    --------
    anchor    coanchor
    anchor    form
    site      coanchor

All three of our evaluations have, at their core, the pairing between the 
fundamental spaces A and A*, between anchors and coanchors. Given any 
anchor p over A and coanchor h on A, we can view pairing the two of them as 
evaluating either of them at the other, depending upon how we are breaking 
the symmetry at the moment: p(h) = h(p) = (h,p). 

Let $p$ and $q$ be anchors, while $g$ and $h$ are coanchors. In building the
algebra of forms, we learned how to evaluate a form at an anchor. For
example, we have $(g^3 + 5gh)(p) = g(p)^3 + 5g(p)h(p)$. In building the algebra
of sites, we learned how to evaluate a site at a coanchor in a completely
analogous way, modulo a little notational confusion. For example, we have
$(p^3 + 5pq)(h) = p(h)^3 + 5p(h)q(h)$, though we might prefer to write that
value as $h(p)^3 + 5h(p)h(q)$.
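
As a small numerical sketch of these evaluations, assume that anchors over the plane are stored as coordinate triples in the frame (C, phi, psi) and coanchors as triples in the dual basis (w, u, v), so that the pairing is a plain dot product. The function names and sample values below are illustrative assumptions, not notation from the text.

```python
# Anchors are triples (weight, u, v) meaning weight*C + u*phi + v*psi.
# Coanchors are triples in the dual basis (w, u, v). Both conventions are
# assumptions made for this sketch.

def pair(h, p):
    """Pairing of a coanchor h with an anchor p (a dot product of coordinates)."""
    return sum(hi * pi for hi, pi in zip(h, p))

def eval_form(g, h, p):
    """Value of the form g^3 + 5 g h at the anchor p."""
    return pair(g, p) ** 3 + 5 * pair(g, p) * pair(h, p)

def eval_site(p, q, h):
    """Value of the site p^3 + 5 p q at the coanchor h."""
    return pair(h, p) ** 3 + 5 * pair(h, p) * pair(h, q)

p = (1, 2, 0)    # the point C + 2 phi (weight 1)
q = (0, 1, 1)    # the vector phi + psi (weight 0)
g = (1, 0, 0)    # the weight coanchor w
h = (0, 1, -1)   # the coanchor u - v

form_value = eval_form(g, h, p)   # g(p) = 1 and h(p) = 2
site_value = eval_site(p, q, h)   # h(p) = 2 and h(q) = 0
```

Both evaluations reduce to the same anchor-coanchor pairing; only the choice of which side carries the polynomial structure differs.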

Later on, we shall study various other ways in which to combine forms 
and sites: pairing an n-form with an n-site to produce a scalar, contracting
an n-form on a k-site to produce an $(n-k)$-form, and so on. But we shall
reserve the term "evaluation" for these three flavors of combination. 



Chapter 6 

The Veronese Prototypes 



Let $A$ be an affine parameter space of dimension $d$. We have constructed
the algebra $\mathrm{Sym}(\hat A)$ of sites over $A$, in parallel with the well-known algebra
$\mathrm{Sym}(\hat A^*)$ of forms on $A$. In Chapter 7, we are going to pair up, for each
$n$, the linear spaces $\mathrm{Sym}^n(\hat A)$ and $\mathrm{Sym}^n(\hat A^*)$ of n-sites and n-forms, so that
each of them can represent the dual of the other. But the algebra of sites
has important applications in CAGD, even without those pairing maps.

The key to those applications is the set of n-sites that are perfect $n$th
powers. The linear space $\mathrm{Sym}^n(\hat A)$ of n-sites over $A$ has dimension $\binom{n+d}{d}$.
Inside that big space, we focus on those n-sites $s$ that have the special form
$s = P^n$, for some point $P$ in $A$. Those n-sites make up a certain d-fold: a
curve when $d = 1$, a surface when $d = 2$. This d-fold is important to CAGD
because it can serve as a prototype for all polynomial, parametric d-folds
of degree at most $n$. For example, when $d = 1$, the geometric structure of
an n-ic Bezier curve is best understood by viewing that curve as an affine
transform of the moment curve of degree $n$, the curve $\kappa_n$ that maps a point
$P$ on its parameter line to the n-site $\kappa_n(P) := P^n$. In a similar way, when
$d = 2$, an n-ic Bezier triangular surface is best understood by viewing it as an
affine transform of the Veronese surface of parametric degree $n$, the surface
$\sigma_n$ that maps a point $P$ on its parameter plane to the n-site $\sigma_n(P) := P^n$.



6.1 Quadratic sites over the line 

We begin with the case $n = 2$ and $d = 1$, looking for those quadratic sites
over the line that are perfect squares.

As a convenient line to work with, let's take the $u$ axis of the plane
$A = \{C + u\varphi + v\psi \mid u, v \in \mathbb{R}\}$ shown in Figure 4.1; that is, let's take the
affine line $L$ given by $L := \{C + u\varphi \mid u \in \mathbb{R}\}$. When our parameter space
is a single line, it is convenient to adopt some scheme that names a point on
that line using a single real number. So, for any real number $t$, let $\bar t$ denote
the point on the line $L$ whose $u$-coordinate is $t$; that is, we set $\bar t := C + t\varphi$.

Figure 6.1: The plane of unit-weight 2-sites over the line $L$.

We want to study the linear space Sym 2 (L) of quadratic sites over the 
line L. We know of three different concrete constructions for that space, 
corresponding to the last three lines in Table 5.1: Sym 2 (L) = R 2 [C, p] = 
(R 2 [L]/ = Poly 2 (L*,R). In this section, we'll stick to the first of those 
three and study the space R2[C, p], a linear 3-space of quadratic polynomials: 

$$\mathrm{Sym}^2(\hat L) = \mathbb{R}_2[C, \varphi] = \{\, rC^2 + xC\varphi + y\varphi^2 \mid r, x, y \in \mathbb{R} \,\}.$$

We can save one dimension by restricting our attention to those quadratic
sites over $L$ that have unit weight, that is, to the case $r = 1$; such sites form
the affine plane

$$\mathrm{Sym}^2(\hat L)^{\downarrow} = \mathbb{R}_2[C, \varphi]^{\downarrow} = \{\, C^2 + xC\varphi + y\varphi^2 \mid x, y \in \mathbb{R} \,\}.$$

That plane is pictured, using $x$ and $y$ as coordinates, in Figure 6.1.

By elementary algebra, we can plot various unit-weight 2-sites over $L$ as
points in Figure 6.1. For example, we have $\bar 1^2 = (C + \varphi)^2 = C^2 + 2C\varphi + \varphi^2$,
so we plot the 2-site $\bar 1^2$ at the spot $x = 2$ and $y = 1$. More generally, for any
real number $t$, we have $\bar t^2 = (C + t\varphi)^2 = C^2 + 2t\,C\varphi + t^2\varphi^2$, with $x = 2t$ and
$y = t^2$. Thus, the squares of the points on $L$ form a parabola in the plane
of Figure 6.1: the parabola with equation $x^2 - 4y = 0$. Note that $x^2 - 4y$ is
the discriminant of the quadratic polynomial $C^2 + xC\varphi + y\varphi^2$; this makes
sense, since the algebra of sites is essentially a polynomial algebra.
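
The calculation above can be replayed numerically. The sketch below assumes that a site of degree n over the line is stored as a coefficient list [s_0, ..., s_n], where s_k multiplies C^(n-k) phi^k, so that multiplying sites is polynomial multiplication; the helper names `mul`, `point`, and `square` are hypothetical.

```python
from fractions import Fraction

def mul(s, t):
    """Multiply two sites over the line L, stored as coefficient lists.

    The list [s_0, ..., s_n] stands for
    s_0 C^n + s_1 C^(n-1) phi + ... + s_n phi^n.
    """
    out = [0] * (len(s) + len(t) - 1)
    for i, si in enumerate(s):
        for j, tj in enumerate(t):
            out[i + j] += si * tj
    return out

def point(t):
    """The point C + t*phi, an anchor of weight 1."""
    return [1, t]

def square(t):
    """The 2-site obtained by squaring the point with coordinate t."""
    return mul(point(t), point(t))

# Sample points with coordinates -2, -3/2, ..., 2 and their squares.
samples = [Fraction(k, 2) for k in range(-4, 5)]
squares = [square(t) for t in samples]

# Each square [r, x, y] has unit weight r = 1 and lies on x^2 - 4y = 0.
on_parabola = all(r == 1 and x * x - 4 * y == 0 for r, x, y in squares)
```

Exact rational arithmetic is used so that the parabola equation can be checked with equality rather than a tolerance.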

Let's also plot the sites of the form $\bar 0\,\bar t$. We calculate $\bar 0\,\bar t = C(C + t\varphi) =
C^2 + t\,C\varphi$. Thus, those sites constitute the $x$-axis in Figure 6.1, which is the
tangent line to the parabola of squares at the 2-site $\bar 0^2$.

The point $\bar 0$ on the parameter line $L$ is no different from any other point
on $L$; so we suspect that, for any real number $a$, the tangent line to the
parabola of squares at the 2-site $\bar a^2$ should comprise those 2-sites of the form
$\bar a\,\bar b$, for varying $b$. The latter sites clearly form some line, and that line clearly
passes through the site $\bar a^2$ when $b := a$. All that remains to verify is the
tangency. So consider how the 2-site $\overline{a+h}^{\,2}$ behaves, as the real number $h$
tends to zero; we have

$$\overline{a+h}^{\,2} = (\bar a + h\varphi)^2 = \bar a^2 + 2h\,\bar a\varphi + h^2\varphi^2
= \bar a(\bar a + 2h\varphi) + h^2\varphi^2 = \bar a\;\overline{a+2h} + h^2\varphi^2.$$

Since $h^2$ goes to zero faster than $h$, the tangent line to the parabola at $\bar a^2$
is indeed the line whose 2-sites have the form $\bar a\,\bar b$. (Note that we carried out
that analysis using $(\bar a, \varphi)$ as our basis for the linear space $\hat L$ of anchors over
$L$, rather than the standard basis $(C, \varphi) = (\bar 0, \varphi)$. There are many such
situations where it is convenient to use some non-standard basis.)

Let $s = C^2 + xC\varphi + y\varphi^2$ be any 2-site in the plane $\mathrm{Sym}^2(\hat L)^{\downarrow}$ of Figure 6.1.
The site $s$ lies outside the parabola just when its discriminant $x^2 - 4y$ is
positive. In that case, we can see geometrically that $s$ is the intersection of
the tangent lines to the parabola at $\bar a^2$ and $\bar b^2$, for some two distinct real
numbers $a$ and $b$. So the 2-site $s$ must have both $\bar a$ and $\bar b$ as factors, which
means that $s = \bar a\,\bar b$ must be the product of the points $\bar a$ and $\bar b$ on $L$. If we
like, we can use the Quadratic Formula to compute $a$ and $b$; we have

$$C^2 + xC\varphi + y\varphi^2 = \left(C + \frac{x + \sqrt{x^2 - 4y}}{2}\,\varphi\right)\left(C + \frac{x - \sqrt{x^2 - 4y}}{2}\,\varphi\right)$$

and hence

$$s = \overline{\left(\frac{x + \sqrt{x^2 - 4y}}{2}\right)}\;\;\overline{\left(\frac{x - \sqrt{x^2 - 4y}}{2}\right)}.$$

Those 2-sites $s = C^2 + xC\varphi + y\varphi^2$ that lie inside the parabola, with $x^2 - 4y$
negative, don't factor as the product of two points over the real numbers.
They would factor over the complex numbers; but the real numbers are the
scalars of primary interest in CAGD.
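
The Quadratic Formula step can be sketched as a small function. The name `factor_unit_2site` and the convention of returning `None` for sites inside the parabola are assumptions made for this illustration.

```python
import math

def factor_unit_2site(x, y):
    """Split the unit-weight 2-site C^2 + x C phi + y phi^2 into two points.

    Returns the coordinates (a, b), with a >= b, of two points on L whose
    product is the given site, or None when x^2 - 4y is negative.
    """
    disc = x * x - 4 * y
    if disc < 0:
        return None               # inside the parabola: no real factorization
    root = math.sqrt(disc)
    return ((x + root) / 2, (x - root) / 2)

outside = factor_unit_2site(5, 6)    # (C + 3 phi)(C + 2 phi)
on_curve = factor_unit_2site(2, 1)   # the perfect square (C + phi)^2
inside = factor_unit_2site(0, 1)     # C^2 + phi^2 does not split over the reals
```

Note that the two coordinates sum to x and multiply to y, which matches expanding the product of the two points.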

Warning: While the product of any n anchors is an n-site, it is by no 
means the case that every n-site splits as the product of n anchors. Here 
already, in studying Figure 6.1, we have examples of 2-sites that don't split: 
the ones inside the parabola of squares. Once the parametric dimension d 
exceeds 1, the sites that do split, even over the complex numbers, become a 
tiny minority of all sites; see Exercise 6.7-1. 

For any affine space $A$ and for any nonnegative $n$, we shall say that an
n-form on $A$ is real-lineal when it splits (that is, factors completely)
over the reals as the product of $n$ coanchors on $A$. When the $n$ factors are
allowed to be complex, we'll use the term complex-lineal. Similarly, an n-site
is real-lineal or complex-lineal when it factors, over the reals or complexes,
as the product of $n$ anchors. The 2-sites in Figure 6.1 that are real-lineal are
those that lie either on the parabola of perfect squares or outside it. All of
the 2-sites in Figure 6.1 (indeed, all n-sites over the line $L$ for every $n$)
are complex-lineal.

More generally, given any graded algebra (not necessarily commutative), 
a homogeneous element of grade n in that algebra is called lineal when it can 
be expressed as the product of n linear elements, that is, of n elements that 
are homogeneous of grade 1. Some synonyms for this sense of "lineal" are
"simple", "totally decomposable", and "completely reducible". (A fine point:
Under this definition, the multiplicative identity 1 is the only homogeneous 
element of grade 0 that is lineal, since 1 is the value of the unique empty 
product. So, for example, 1 is the only 0-form that is lineal and also the only 
0-site that is lineal. Some authors are more generous, calling an n-ic thing 
lineal whenever it can be written as a scalar multiple of a product of n linear 
things; under that definition, all scalars are lineal.) 

Note that the same geometric and algebraic properties that hold of sites
hold also of forms, the only exceptions being those that involve the weight
coanchor. For example, consider a quadratic form on the line $L$, an element
$f := aw^2 + bwu + cu^2$ of the linear 3-space $\mathrm{Sym}^2(\hat L^*) = \mathbb{R}_2[u, w]$. The forms
$f$ with $b^2 - 4ac = 0$ are plus or minus the square of a coanchor; they form
a quadratic cone in the space $\mathrm{Sym}^2(\hat L^*)$. The forms $f$ that lie on or outside
that cone have $b^2 - 4ac \ge 0$, and they factor as the product of two coanchors.
Returning to the algebra of sites, we have precisely similar structures. A
quadratic site $s := rC^2 + xC\varphi + y\varphi^2$ of weight $r$ is plus or minus the square
of an anchor just when $x^2 - 4ry = 0$, and the sites $s$ that lie on or outside
that cone, with $x^2 - 4ry \ge 0$, are the ones that factor as the product of two
anchors. In our analysis above, we restricted ourselves to the plane $r = 1$ of
unit-weight sites, where that plane cuts the cone of squares in the parabola
of Figure 6.1. The analogous restriction for quadratic forms would require
the coefficient $a$ of $w^2$ to be 1, that is, would restrict our attention to forms
whose value at the point $C$ happens to be 1. But such a restriction would
be unnatural, since we made an arbitrary choice when we selected $C$ as the
center point of our coordinate system for the line $L$.

6.2 The prototypical parabola 

The parabola in Figure 6.1 is the image of the affine line $L$ under the squaring
map, the map $\kappa_2 : L \to \mathrm{Sym}^2(\hat L)^{\downarrow}$ that takes a point $P$ on $L$ as its argument
and squares it: $\kappa_2(P) := P^2$. Expressing $P$ in terms of our standard basis as
$P = \bar u = C + u(P)\,\varphi$, we have $\kappa_2(\bar u) = C^2 + 2u\,C\varphi + u^2\varphi^2$, as above.

Now, suppose that we want to use an arc of a parabola as part of a
spline curve that we are designing. Let $F : L \to O$ be that parabola, sitting
in some plane in our object space $O$, and suppose we want to use the arc
$F([\bar a\,..\,\bar b])$. No matter what design methodology we employ, the parabola $F$
will be an affine transform of the particular parabola $\kappa_2$. That is, there will
exist an instancing transformation, an affine map $f : \mathrm{Sym}^2(\hat L)^{\downarrow} \to O$ with
$F(P) = f(\kappa_2(P)) = f(P^2)$, for all points $P$ on $L$. In this way, the particular
parabola $\kappa_2$ can serve as a prototype for all parabolas.

To add the parabolic arc $F([\bar a\,..\,\bar b])$ to our design, it suffices to specify
the instancing transformation $f$. And one simple way to specify the map $f$
is to specify the images under $f$ of the three sites $\bar a^2$, $\bar a\,\bar b$, and $\bar b^2$; note that,
whenever the real numbers $a$ and $b$ are distinct, those three sites constitute
an affine frame for the plane $\mathrm{Sym}^2(\hat L)^{\downarrow}$ of Figure 6.1. The images of those
three sites under $f$ are, of course, the three Bezier points of the parabolic
segment $F([\bar a\,..\,\bar b])$. To see this algebraically, suppose that the point $P$ on $L$
is located $t$ of the way from $\bar a$ to $\bar b$, so that $P = (1-t)\,\bar a + t\,\bar b$. We then have

$$\begin{gathered}
F(P) = f(\kappa_2(P)) = f(P^2) \\
= f\bigl(((1-t)\,\bar a + t\,\bar b)^2\bigr) \\
= f\bigl((1-t)^2\,\bar a^2 + 2t(1-t)\,\bar a\,\bar b + t^2\,\bar b^2\bigr) \\
= (1-t)^2 f(\bar a^2) + 2t(1-t)\, f(\bar a\,\bar b) + t^2 f(\bar b^2).
\end{gathered}$$

In degenerate cases, we might choose the three Bezier points $f(\bar a^2)$, $f(\bar a\,\bar b)$,
and $f(\bar b^2)$ to be collinear, or even choose all three to coincide. The instancing
transformation $f$ would then collapse the plane of Figure 6.1 down either to a
line or to a point. But this collapsing happens only to our parabolic instance
$F$, not to the prototypical parabola $\kappa_2$.
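
The derivation above can be checked on a concrete instance. The sketch below assumes a hypothetical planar parabola F(u) = c0 + c1 u + c2 u^2 and writes its instancing map in the (x, y) coordinates of Figure 6.1; the coefficient values and all function names are illustrative assumptions.

```python
from fractions import Fraction

# A hypothetical parabola F(u) = c0 + c1*u + c2*u^2 with 2D coefficients.
c0, c1, c2 = (0, 0), (1, 0), (0, 1)

def F(u):
    """Evaluate the parabola directly at the parameter value u."""
    return tuple(k0 + k1 * u + k2 * u * u for k0, k1, k2 in zip(c0, c1, c2))

def f(x, y):
    """Instancing map on the unit-weight 2-site C^2 + x C phi + y phi^2.

    The square of the point with coordinate u has (x, y) = (2u, u^2), so the
    affine map f(x, y) = c0 + c1*x/2 + c2*y satisfies f(square of u) = F(u).
    """
    return tuple(k0 + k1 * Fraction(x, 2) + k2 * y
                 for k0, k1, k2 in zip(c0, c1, c2))

def bezier_points(a, b):
    """Images under f of the three frame sites a^2, a*b, and b^2."""
    return [f(2 * a, a * a), f(a + b, a * b), f(2 * b, b * b)]

def from_bezier(points, t):
    """Combine three Bezier points with the quadratic Bernstein weights."""
    weights = [(1 - t) ** 2, 2 * t * (1 - t), t ** 2]
    return tuple(sum(w * p[k] for w, p in zip(weights, points))
                 for k in range(2))

a, b, t = 1, 3, Fraction(1, 4)
pts = bezier_points(a, b)
P = (1 - t) * a + t * b          # coordinate of the point t of the way from a to b
```

With these values, `from_bezier(pts, t)` and `F(P)` agree, which is the identity derived above.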

6.3 The moment curves 

In a similar way, the $n$th-power map $\kappa_n : L \to \mathrm{Sym}^n(\hat L)^{\downarrow}$ defined by $\kappa_n(P) :=
P^n$ provides a prototype for all polynomial curves of degree at most $n$. For
example, when $n = 3$, we have $\bar t^3 = (C + t\varphi)^3 = C^3 + 3t\,C^2\varphi + 3t^2\,C\varphi^2 + t^3\varphi^3$,
so our prototypical cubic is the twisted cubic curve $(x, y, z) := (3t, 3t^2, t^3)$,
sitting in the affine 3-space

$$\mathrm{Sym}^3(\hat L)^{\downarrow} = \mathbb{R}_3[C, \varphi]^{\downarrow} = \{\, C^3 + xC^2\varphi + yC\varphi^2 + z\varphi^3 \mid x, y, z \in \mathbb{R} \,\}.$$

In projective geometry, the curve that results from the analogous construction
is called the rational normal curve of degree $n$. Since we are working in an
affine space, instead of in its projective closure, we'll refer to $\kappa_n$ by its other
name: the moment curve of degree $n$.

The tangent lines, osculating planes, and so forth of the moment curve
$\kappa_n$ are related to the multiplication in the algebra of sites as follows.

Proposition 6.3-1 Let $\kappa_n : L \to \mathrm{Sym}^n(\hat L)^{\downarrow}$ be the moment curve of degree
$n$ given by $\kappa_n(P) := P^n$, for all points $P$ on the affine line $L$. A unit-weight
n-site $s$ over $L$ lies in the affine k-flat that osculates the curve $\kappa_n$ to $k$th order
at $P^n$ just when the $(n-k)$-site $P^{n-k}$ divides $s$.

Proof As the real number $h$ tends to 0, we have

$$\begin{gathered}
\overline{t+h}^{\,n} = (\bar t + h\varphi)^n = \sum_{0 \le i \le n} \binom{n}{i}\, \bar t^{\,n-i} h^i \varphi^i \\
= \sum_{0 \le i \le k} \binom{n}{i}\, \bar t^{\,n-i} h^i \varphi^i + O(h^{k+1}) \\
= \bar t^{\,n-k} \sum_{0 \le i \le k} \binom{n}{i}\, \bar t^{\,k-i} h^i \varphi^i + O(h^{k+1}).
\end{gathered}$$

Thus, the moment curve $\kappa_n$ is approximated to $k$th order, near the n-site $\bar t^{\,n}$,
by the k-flat that consists of all multiples of $\bar t^{\,n-k}$. □

For example, consider the 3-site $\bar a\,\bar b^2$. The one factor of $\bar a$ puts us in the
osculating plane to the twisted cubic $\kappa_3$ at $\bar a^3$, while the two factors of $\bar b$ put
us in the osculating plane at $\bar b^3$ twice, that is, on the tangent line at $\bar b^3$. So
the 3-site $\bar a\,\bar b^2$ sits where that tangent line cuts that osculating plane.
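
This example can be verified with a short computation. The sketch assumes the coefficient-list representation of sites over the line, with entry k multiplying C^(3-k) phi^k, and the hypothetical helpers `mul`, `point`, and `divides`; the sample coordinates a = 2 and b = 5 are arbitrary.

```python
def mul(s, t):
    """Multiply sites over L stored as coefficient lists [s_0, ..., s_n],
    where s_k is the coefficient of C^(n-k) phi^k."""
    out = [0] * (len(s) + len(t) - 1)
    for i, si in enumerate(s):
        for j, tj in enumerate(t):
            out[i + j] += si * tj
    return out

def point(t):
    """The point C + t*phi on L."""
    return [1, t]

def divides(t, site):
    """True when the point C + t*phi divides the site.

    Setting C = -t and phi = 1 kills the factor C + t*phi, so the point
    divides the site exactly when the site vanishes under that substitution.
    """
    n = len(site) - 1
    return sum(s * (-t) ** (n - k) for k, s in enumerate(site)) == 0

a, b = 2, 5
b_squared = mul(point(b), point(b))
site = mul(point(a), b_squared)        # the 3-site a b^2
cube_b = mul(point(b), b_squared)      # the 3-site b^3
velocity = mul(b_squared, [0, 1])      # b^2 phi, one third of d/dt (t^3) at t = b
offset = [s - c for s, c in zip(site, cube_b)]
```

Here `offset` is the scalar multiple (a - b) of `velocity`, so the site lies on the tangent line at the cube of b, and `divides(a, site)` confirms that it also lies in the osculating plane at the cube of a.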

The moment curve $\kappa_n$ can serve as a prototype for all n-ic polynomial
parametric curves. Given any such curve $F : L \to O$, sitting in some object
space $O$, there exists a unique affine map $f : \mathrm{Sym}^n(\hat L)^{\downarrow} \to O$ that realizes $F$
as an instance of the prototype $\kappa_n$, that is, that satisfies $F(P) = f(\kappa_n(P)) =
f(P^n)$, for all points $P$ on $L$. Given some parameter interval $[\bar a\,..\,\bar b]$ on $L$, one
convenient way to determine which n-ic segment $F([\bar a\,..\,\bar b])$ we want in some
design of ours is to specify the instancing transformation $f$ by specifying the
images under $f$ of the n-sites $\bar a^n$, $\bar a^{n-1}\bar b$, through $\bar b^n$, those images being the
Bezier points of the segment $F([\bar a\,..\,\bar b])$.

Note that the instancing transformation $f$ may well fail to be injective.
Indeed, the prototypical cubic $\kappa_3$ is twisted, spanning the affine 3-space
$\mathrm{Sym}^3(\hat L)^{\downarrow}$. So, the instancing transformation for any planar cubic segment
will definitely fail to be injective; its four Bezier points will be coplanar.
When the instancing transformation $f$ fails to be injective in this way, the
differential geometry in the object space gets affected. For example, all of
the osculating planes of a planar cubic coincide, so we can't construct the
point $f(\bar a\,\bar b\,\bar c)$ geometrically by intersecting the osculating planes to $F$ at the
parameter values $a$, $b$, and $c$. But the differential geometry of the prototype
is not affected. We can still intersect the osculating planes to $\kappa_3$ at $\bar a^3$, $\bar b^3$,
and $\bar c^3$ to find the 3-site $\bar a\,\bar b\,\bar c$ and then apply the instancing transformation $f$.
The resulting point $f(\bar a\,\bar b\,\bar c)$ is the blossom value $F(\bar a, \bar b, \bar c)$, as we discuss next.

6.4 The relationship to blossoming

Let $F : L \to O$ be an n-ic, parametric curve in some affine object space $O$.
The polar form or blossom of $F$ is the unique symmetric, n-affine function
$F : L^n \to O$ that agrees with $F$ on the diagonal, that is, that satisfies

$$F(P, \ldots, P) = F(P).$$

The term "polar form" points out the relationship to polarization in other
contexts; but we are using the term "blossom" in this monograph, to alleviate
overuse of the word "form". The blossom gives us an enlightening way to
name the Bezier points of any segment of $F$, the $k$th Bezier point of the
segment $F([\bar a\,..\,\bar b])$ being the blossom value

$$F(\underbrace{\bar a, \ldots, \bar a}_{n-k}, \underbrace{\bar b, \ldots, \bar b}_{k}).$$

But the algebra of sites gives us an even better naming scheme for Bezier
points. We realize the particular n-ic curve $F$ as an affine transform of
the prototype $\kappa_n$; that is, for some affine map $f : \mathrm{Sym}^n(\hat L)^{\downarrow} \to O$, we have
$F(P) = f(\kappa_n(P)) = f(P^n)$, for all points $P$ in $L$. It immediately follows
that the blossom of $F$ is given by

$$F(P_1, \ldots, P_n) = f(P_1 \cdots P_n),$$

since that right-hand side is clearly symmetric, n-affine, and agrees with $F$
on the diagonal. In particular, we can now write the $k$th Bezier point of the
segment $F([\bar a\,..\,\bar b])$ simply as $f(\bar a^{n-k}\,\bar b^k)$.
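
A minimal sketch of this formula, for one scalar coordinate of a curve given in the power basis, follows. It assumes sites over the line are stored as coefficient lists whose entry k multiplies C^(n-k) phi^k; the names `instancing_map` and `blossom` and the sample cubic are hypothetical.

```python
from fractions import Fraction
from math import comb

def mul(s, t):
    """Multiply sites over L stored as coefficient lists."""
    out = [0] * (len(s) + len(t) - 1)
    for i, si in enumerate(s):
        for j, tj in enumerate(t):
            out[i + j] += si * tj
    return out

def product_of_points(ts):
    """The n-site obtained by multiplying the points with coordinates ts."""
    site = [1]                        # the empty product, the 0-site 1
    for t in ts:
        site = mul(site, [1, t])      # multiply by the point C + t*phi
    return site

def instancing_map(coeffs):
    """Affine map f with f(n-th power of t) = F(t), F(t) = sum coeffs[k] t^k.

    The coefficient of C^(n-k) phi^k in the n-th power of C + t*phi is
    comb(n, k) * t^k, so f sends [s_0, ..., s_n] to
    sum_k coeffs[k] * s_k / comb(n, k).
    """
    n = len(coeffs) - 1
    def f(site):
        return sum(Fraction(c) * s / comb(n, k)
                   for k, (c, s) in enumerate(zip(coeffs, site)))
    return f

def blossom(coeffs, ts):
    """Blossom value at the points ts, computed as f(product of the points)."""
    return instancing_map(coeffs)(product_of_points(ts))

cubic = [1, -2, 0, 4]                 # hypothetical F(t) = 1 - 2t + 4t^3

def F(t):
    return sum(c * t ** k for k, c in enumerate(cubic))

diagonal = blossom(cubic, [2, 2, 2])  # agrees with F(2)
mixed = blossom(cubic, [1, 2, 3])     # symmetric in its three arguments
```

Symmetry comes for free because the product of points is commutative, and multi-affineness follows because each point enters the product linearly.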

This is a significant notational improvement. By exploiting exponential 
notation, we can now write down the k th Bezier point in running text, without 
requiring a displayed formula and horizontal braces. 

Furthermore, that improved notation is just one of the rewards for an
underlying conceptual advance: replacing concatenation with multiplication.
When computing a blossom value $F(P_1, \ldots, P_n)$, we assemble the points
$P_1$ through $P_n$ by concatenating them into a sequence. Concatenation is
automatically associative; but we must explicitly require the blossom $F$ to
be symmetric in order to get commutativity.† Multiplication is better in every
way: It is automatically both associative and commutative; it also distributes
over addition; and the notations associated with it are more concise, to boot.
Thus, moving from concatenation to multiplication is a big win.

† We could build in commutativity by using a multiset (a.k.a. bag or suite) as the input
to the blossom, rather than a sequence. But multisets are unfamiliar, and they introduce
their own notational challenges. For example, the domain space of the blossom would then
be the set of all multisets of size $n$ whose $n$ elements are points on the parameter line $L$,
a set for which there is no standard notation.

Figure 6.2: The de Casteljau Algorithm on the prototypical cubic $\kappa_3$

Two things combine to make that win available to us: First, the curve $\kappa_n$
is rich enough to serve as a prototype for all polynomial n-ics; and second,
that prototype $\kappa_n$ is defined, not in some arbitrary way, but by exploiting
the multiplication in an algebra.

Why is $\kappa_n$ rich enough to serve as a prototype? Because all of the
higher-order, nonaffine stuff that has to happen as part of evaluating an n-ic curve $F$
at an argument point $P$ already happens as part of computing $\kappa_n(P)$. Once
we know the n-site $\kappa_n(P)$, we can compute $F(P) = f(\kappa_n(P))$ by applying
the instancing transformation $f$, which is an affine map. But that property
of $\kappa_n$ is shared by lots of other curves; indeed, by every n-ic polynomial
curve that is twisted in $n$ different dimensions, so as to have an affine span
that is n-dimensional.

The second key point about the prototype $\kappa_n$ is that we have defined
it using the multiplication in the algebra of sites, setting $\kappa_n(P) := P^n$.
The geometric structure of the de Casteljau Algorithm simply reflects the
multiplicative structure of that algebra. Figure 6.2 shows the de Casteljau
Algorithm working on the moment cubic $\kappa_3$.
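
The following sketch runs the de Casteljau Algorithm directly on sites, in the spirit of Figure 6.2. It assumes the coefficient-list representation of sites over the line (entry k multiplies C^(3-k) phi^k); the helper names and the sample interval from 0 to 2 are assumptions made for illustration.

```python
from fractions import Fraction

def mul(s, t):
    """Multiply sites over L stored as coefficient lists."""
    out = [0] * (len(s) + len(t) - 1)
    for i, si in enumerate(s):
        for j, tj in enumerate(t):
            out[i + j] += si * tj
    return out

def point(t):
    """The point C + t*phi on L."""
    return [1, t]

def site(a, i, b, j):
    """The (i+j)-site formed from i factors of point a and j factors of point b."""
    s = [1]
    for t in [a] * i + [b] * j:
        s = mul(s, point(t))
    return s

def lerp(s, t, lam):
    """Affine combination (1 - lam)*s + lam*t of two sites of equal degree."""
    return [(1 - lam) * x + lam * y for x, y in zip(s, t)]

def de_casteljau(sites, lam):
    """Repeatedly interpolate neighbouring sites until a single site remains."""
    while len(sites) > 1:
        sites = [lerp(s, t, lam) for s, t in zip(sites, sites[1:])]
    return sites[0]

n, a, b, lam = 3, 0, 2, Fraction(3, 4)
control = [site(a, n - k, b, k) for k in range(n + 1)]   # a^3, a^2 b, a b^2, b^3
p = (1 - lam) * a + lam * b                              # coordinate of the target point
result = de_casteljau(control, lam)
cube_p = site(p, n, p, 0)                                # the cube of the target point
```

The output `result` equals `cube_p`: each interpolation stage multiplies in one more factor of the target point, which is the multiplicative structure that the algorithm reflects.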

We don't cover the details in this monograph, but a similar result holds
in the rational case. Every parametric rational n-ic curve is a projective
transform of the n-ic rational normal curve of algebraic geometry, which is
the projective closure of the n-ic moment curve $\kappa_n$.

Exercise 6.4-1 Which cubic sites over the affine line are real-lineal? What
does this question have to do with the twisted cubic $\kappa_3$ of perfect cubes?

Answer: The discriminant of the 3-site $s = rC^3 + xC^2\varphi + yC\varphi^2 + z\varphi^3$ is
$18rxyz + x^2y^2 - 4ry^3 - 4x^3z - 27r^2z^2$. This discriminant is zero precisely on
the ruled surface that is swept out by the tangent lines to the twisted cubic
$\kappa_3$, that is, precisely for those sites $s$ that are divisible by a perfect square.
When the discriminant is positive, which it is on one side of that quartic
ruled surface, the site $s$ splits, over the reals, as the product of three distinct
anchors. When the discriminant is negative, the site $s$ has, as its factors, one
real anchor and one pair of conjugate, complex anchors.
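
The discriminant in this answer is easy to tabulate. The sketch below uses the formula exactly as stated; the function names and sample sites are assumptions for illustration. For a unit-weight product of three points with coordinates a, b, c, the discriminant equals the squared product of the pairwise differences.

```python
def cubic_discriminant(r, x, y, z):
    """Discriminant of the 3-site r C^3 + x C^2 phi + y C phi^2 + z phi^3."""
    return (18 * r * x * y * z + x * x * y * y
            - 4 * r * y ** 3 - 4 * x ** 3 * z - 27 * r * r * z * z)

def product_of_three(a, b, c):
    """Coefficients (r, x, y, z) of the product of the points a, b, c on L."""
    return (1, a + b + c, a * b + a * c + b * c, a * b * c)

def classify(r, x, y, z):
    """Describe how the 3-site factors, following the sign of the discriminant."""
    d = cubic_discriminant(r, x, y, z)
    if d > 0:
        return "three distinct real anchors"
    if d < 0:
        return "one real anchor and a conjugate complex pair"
    return "divisible by a perfect square"

distinct = product_of_three(1, 2, 4)
repeated = product_of_three(2, 2, 5)
complex_pair = (1, 0, 1, 0)          # C^3 + C phi^2 = C (C^2 + phi^2)
```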

Exercise 6.4-2 What does the discriminant of a quartic site over the affine
line tell us about whether or not that site is real-lineal?

Answer: The 4-site $rC^4 + wC^3\varphi + xC^2\varphi^2 + yC\varphi^3 + z\varphi^4$ has, as its
discriminant, a homogeneous sextic polynomial in the coefficients $(r, w, x, y, z)$
with sixteen terms. Readers who want the details can type this to Maple [8]:

    discrim(r*t^4 + w*t^3 + x*t^2 + y*t + z, t);

The zero-set of this discriminant is the 3-fold in the 4-space $\mathrm{Sym}^4(\hat L)^{\downarrow}$ whose
4-sites are divisible by a perfect square. Such 4-sites may have the form
$P^2QR$, $P^2E\bar E$, or $E^2\bar E^2$, where $P$, $Q$, and $R$ are real points on the line $L$,
while $E$ is a complex point on $L$ and $\bar E$ is its conjugate. Thus, 4-sites with zero
discriminant may have 4, 2, or 0 real anchors as factors. Note that the sites
of the first two types are swept out by the osculating planes to the moment
quartic $\kappa_4$, as $P$ moves along $L$. A 4-site with negative discriminant has,
as its factors, two distinct real anchors and one pair of conjugate, complex
anchors. A 4-site with positive discriminant may have either four distinct
real factors or else two pairs of conjugate, complex factors.

6.5 The Veronese surface 

Now that we have some intuition for how the $n$th-power map behaves on the
line $L$, let's consider how the squaring map behaves on the plane $A$; that is,
let's return to the case $n = 2$, but now with $d = 2$. Let $\sigma_2 : A \to \mathrm{Sym}^2(\hat A)^{\downarrow}$
be the map defined by $\sigma_2(P) := P^2$, for each point $P$ on the plane $A$. The
image $\sigma_2(A)$ is a curved surface (that is, a 2-fold) sitting in the affine 5-space

$$\mathrm{Sym}^2(\hat A)^{\downarrow} = \mathbb{R}_2[C, \varphi, \psi]^{\downarrow}
= \{\, C^2 + b\varphi^2 + c\psi^2 + x\varphi\psi + yC\psi + zC\varphi \mid b, c, x, y, z \in \mathbb{R} \,\}.$$

The projective completion of the image $\sigma_2(A)$ is called the Veronese surface
in algebraic geometry. To allow for higher degrees, we'll call it the Veronese
surface of parametric degree 2. (Warning: The parametric degree is different
from the degree of the surface itself. Indeed, the Veronese surface $\sigma_2(A)$,
as a variety in 5-space, actually has degree 4. More generally, the Veronese
d-fold of parametric degree $n$, as a variety in projective space of dimension
$\binom{n+d}{d} - 1$, turns out [31] to have degree $n^d$.)

Like the moment curve $\kappa_n$, the Veronese surface $\sigma_2$ is a prototype: a
prototype for all parametric polynomial surfaces of degree at most 2. Let
$F : A \to O$ be any parametric surface in an affine object space $O$ whose
coordinates are given by polynomials of total degree at most 2 in $u$ and
$v$, the coordinates on $A$. Then, there exists a unique affine transformation
$f : \mathrm{Sym}^2(\hat A)^{\downarrow} \to O$ with $F(P) = f(\sigma_2(P)) = f(P^2)$. Note that the domain of
the instancing transformation $f$ here is 5-dimensional. If our surface instance
is to sit in 3-space, as is typically the case in CAGD, then the instancing
transformation $f$ can't possibly be injective.

Bezier points and blossoming for quadratic surfaces hold no surprises.
The blossom of the quadratic surface $F$ is given by $F(P, Q) = f(PQ)$, for
all points $P$ and $Q$ in the plane $A$. Given a reference triangle $\triangle RST$ in $A$,
the six 2-sites $R^2$, $RS$, $RT$, $S^2$, $ST$, and $T^2$ form an affine frame for the
5-space $\mathrm{Sym}^2(\hat A)^{\downarrow}$, and we often specify an affine instancing transformation
$f$ by giving the images of these six frame points under $f$, those images being
the Bezier points of the quadratic triangular patch $F(\triangle RST)$.

Which 2-sites in the affine 5-space $\mathrm{Sym}^2(\hat A)^{\downarrow} = \mathbb{R}_2[C, \varphi, \psi]^{\downarrow}$ are lineal?
That is, for which coefficients $(b, c, x, y, z)$ do there exist coefficients $(u_1, v_1)$
and $(u_2, v_2)$, either real or complex, with

$$\begin{gathered}
s = C^2 + b\varphi^2 + c\psi^2 + x\varphi\psi + yC\psi + zC\varphi \\
= (C + u_1\varphi + v_1\psi)(C + u_2\varphi + v_2\psi)\,?
\end{gathered}$$

Since there are five parameters on the first line and only four on the second,
it is clear that a typical 2-site $s$ is not even complex-lineal. Instead, the five
coefficients $(b, c, x, y, z)$ must satisfy one constraint in order for the resulting
2-site $s$ to have any hope of factoring. That constraint is encoded by a
polynomial, which is again referred to as the discriminant. To write that
discriminant more symmetrically, let's abandon the constraint of unit weight,
replacing the term $C^2$ with $aC^2$. It turns out that the 2-site

$$\begin{gathered}
s = aC^2 \\
{} + zC\varphi + yC\psi \\
{} + b\varphi^2 + x\varphi\psi + c\psi^2
\end{gathered}$$

factors over the complexes as the product of two anchors just when

$$\Delta(a, b, c, x, y, z) := 4abc + xyz - ax^2 - by^2 - cz^2 = 0. \tag{6.5-1}$$
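
As a quick check of this discriminant, the sketch below multiplies two anchors over the plane and evaluates the polynomial of Equation 6.5-1 on the result. Storing an anchor as a triple (weight, u, v) and the function names are assumptions made for this illustration.

```python
def multiply_anchors(p, q):
    """Coefficients (a, b, c, x, y, z) of the product of two anchors.

    An anchor (weight, u, v) stands for weight*C + u*phi + v*psi, and the
    resulting 2-site is
    a C^2 + z C phi + y C psi + b phi^2 + x phi psi + c psi^2.
    """
    (w1, u1, v1), (w2, u2, v2) = p, q
    return (w1 * w2, u1 * u2, v1 * v2,
            u1 * v2 + u2 * v1, w1 * v2 + w2 * v1, w1 * u2 + w2 * u1)

def delta(a, b, c, x, y, z):
    """The discriminant polynomial of Equation 6.5-1."""
    return 4 * a * b * c + x * y * z - a * x * x - b * y * y - c * z * z

lineal = multiply_anchors((1, 2, -1), (1, 0, 3))   # a product of two points
generic = (1, 1, 1, 0, 0, 0)                       # C^2 + phi^2 + psi^2
```

The polynomial is four times the determinant of the symmetric 3-by-3 matrix of the quadratic polynomial, which is why it vanishes exactly on the products of two linear factors.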

Polynomials that have no nontrivial factors, even over the complex
numbers, are called absolutely irreducible, and we shall apply that term also to
forms and sites. Thus, a quadratic site $s$ over the plane $A$ is either
complex-lineal or absolutely irreducible, according as its discriminant $\Delta$ is zero or
nonzero. If $s$ is complex-lineal, it may or may not factor also over the reals.

Warning: The word "discriminant" is used whenever some condition can
be tested by a single polynomial, regardless of what condition that might be.
In the case $d = 1$ of sites over a line (as for univariate polynomials), every
n-site is complex-lineal, and the discriminant tests whether some two of those
$n$ factors coincide. In the case $d = 2$ of sites over a plane (as for bivariate
polynomials), most n-sites are absolutely irreducible. For the particular case
$n = d = 2$, there is a single polynomial that tests for absolute irreducibility:
the discriminant polynomial $\Delta$ in Equation 6.5-1.

Exercise 6.5-2 Assuming that the coefficients $(a, b, c, x, y, z)$ of the 2-site
$s = aC^2 + zC\varphi + yC\psi + b\varphi^2 + x\varphi\psi + c\psi^2$ are real and that the discriminant
$\Delta(a, b, c, x, y, z)$ is zero, when will $s$ be real-lineal?

Hint: In order for $s$ to factor over the reals, the three inequalities

$$x^2 - 4bc \ge 0, \qquad y^2 - 4ac \ge 0, \qquad z^2 - 4ab \ge 0$$

are clearly necessary, and it turns out that they are also sufficient.

A fine point: Given that $\Delta = 0$, any two of those three inequalities
are almost enough to imply the third. For example, the last two imply the
first except in the degenerate case $a = y = z = 0$, where $\Delta = 0$ holds
automatically and the last two inequalities hold automatically as equalities.

Exercise 6.5-3 Which 3-sites over the plane A are complex-lineal? 

Answer: The space $\mathrm{Sym}^3(\hat A)^{\downarrow}$ of unit-weight 3-sites is 9-dimensional,
while the three factors of a lineal 3-site have only 6 degrees of freedom among
them. Hence, there are $9 - 6 = 3$ dimensions' worth of algebraic constraints
that must hold, among the coefficients of a 3-site, in order for that site to be
complex-lineal. Unfortunately, while one algebraic constraint can always be
encoded by a single polynomial, it typically takes more than $k$ polynomials
to encode $k$ dimensions' worth of constraints, the extra polynomials being
required to eliminate spurious roots. To say the same thing in more modern
language, most varieties of codimension $k > 1$ are not complete intersections.
In this exercise, requiring a 3-site over the plane $A$ to be complex-lineal
involves 3 dimensions' worth of constraints; but the most efficient encoding
of those constraints that I know of uses 45 polynomials, each of which is a
quartic in the ten coefficients of a 3-site of arbitrary weight [45].

6.6 Degen's analysis of quadratic surfaces 

While this monograph proposes a new framework for research in CAGD,
most of the results from CAGD that we discuss, such as Bezier points,
are quite basic. In contrast, this section comes closer to current research.
We here use the multiplication in the algebra of sites to illuminate Wendelin
Degen's 1994 analysis [15] of quadratic surfaces.

In that paper, Degen analyzes the various types of surfaces that can
occur as parametric rational quadratics (a.k.a. quadratic Bezier triangles) in
3-space. The prototype for such a surface is the Veronese surface $\sigma_2(A) =
\{P^2 \mid P \in A\}$, consisting of the squares of the points in the plane $A$. That
surface of squares, we recall, is a quartic surface, sitting in the 5-space
$\mathrm{Sym}^2(\hat A)^{\downarrow}$. The instancing transformation for a quadratic Bezier triangle
projects the surface $\sigma_2(A)$, sitting in this 5-space, down into an object
3-space; so the dimension goes down by 2. A projection that reduces the
dimension by 1 projects along those lines that pass through a certain center
point. To reduce the dimension by 2, we must project, instead, along those
planes that pass through a certain center line, a line $\Lambda$ in 5-space. The heart
of Degen's analysis considers the various geometric relationships that the
center line $\Lambda$ can have, both with the Veronese surface $\sigma_2(A)$ itself and with
the cubic 4-fold of complex-lineal sites, that is, the 4-fold $\Delta = 0$ characterized
by the vanishing of the discriminant $\Delta$ in Equation 6.5-1.

Degen's analysis is correct, complete, and pretty; but Degen missed some
opportunities because he worked purely geometrically with the 5-space in
which the Veronese surface sits. In fact, that 5-space $\mathrm{Sym}^2(\hat A)^{\downarrow}$ lies in the
algebra of sites $\mathrm{Sym}(\hat A)$. We here exploit the multiplication of that algebra
to give us a new, more algebraic perspective on Degen's results.

By the way, Degen used this same method of projection to tackle other 
problems [18], in each case studying the geometric relationships between a 
Veronese prototype and the central flat of the instancing transformation that 
projects that prototype down into some object space. Whenever Veronese 
prototypes are exploited in this way, the multiplication of the algebra of sites 
may be a helpful algebraic adjunct to more geometric reasoning. 

Warning: This section is rather technical; indeed, one of its goals is to 
show how the algebra of sites performs when put to a significant test. Some 
readers may prefer to skip on to Section 6.7. 

We are trying not to rely on projective geometry in this monograph. So
the only quadratic surfaces that we can handle are the polynomial ones,
which are produced by projecting the Veronese surface $\sigma_2(A)$ down from
5-space into 3-space along a family of parallel planes. Speaking projectively,
the center line $\Lambda$ of such a projection is a line at infinity: the line at infinity
where all of those parallel planes intersect. In discussing Degen's analysis,
however, we don't want to restrict $\Lambda$ to be a line at infinity. Fortunately, we
don't need projective geometry in order to discuss how an arbitrary line $\Lambda$ in
5-space relates to the Veronese surface $\sigma_2(A)$ and to the cubic 4-fold $\Delta = 0$.
We would need projective geometry to perform a projection from a line $\Lambda$
that wasn't at infinity; but we won't discuss that projection.

6.6.1 The planes in the 4-fold $\Delta = 0$

There turn out to be two families of planes that lie entirely within the cubic
hypersurface $\Delta = 0$, and those planes are important in analyzing the various
degenerate ways in which the center line $\Lambda$ can interact with the hypersurface
$\Delta = 0$. What such planes can we think of?

First, let $L$ denote any line in the affine plane $A$. The image of $L$ under
the Veronese map $\sigma_2$ will be a parabola, lying in some plane $\Pi_L$, a plane
that looks, in fact, just like Figure 6.1. All of the 2-sites in the plane $\Pi_L$ factor
over the complexes. The 2-sites on the parabola itself are the squares of the
points on $L$; the 2-sites outside the parabola factor as $PQ$, where $P$ and $Q$
are distinct, real points on $L$; and the 2-sites inside the parabola factor as
the product of a pair of conjugate complex points on $L$. Thus, the entire
plane $\Pi_L$ lies inside the hypersurface $\Delta = 0$ of complex-lineal sites.

Second, let $P$ be some point in $A$, and consider the plane
$T_P := \{PQ \mid Q \in A\}$. Every site in the plane $T_P$ factors over the reals, so $T_P$
lies inside the hypersurface $\Delta = 0$. In fact, the plane $T_P$ is the tangent plane to
the Veronese surface $\sigma_2$ at $P^2$, as we shall see in Section 6.7.

So we have two families of planes, the first indexed by lines in A and the 
second by points in A. How do planes from those two families intersect? 

Consider first two planes $\Pi_L$ and $\Pi_M$ from the first family. If the lines $L$
and $M$ coincide, then the image planes $\Pi_L$ and $\Pi_M$ also coincide, obviously. If
the lines $L$ and $M$ are distinct, they typically intersect in a unique point $P$.
(In affine geometry, $L$ and $M$ might be parallel; but let's not worry about that
case, since projective geometry stands ready to deal with all of the special
cases caused by parallelism.) The 2-sites that lie in $\Pi_L$ factor as the product
of two points (possibly conjugate complex) along $L$, and the analogous
claim holds for $\Pi_M$. Thus, the 2-site $P^2$ is the unique site that belongs to
both $\Pi_L$ and $\Pi_M$. So two distinct planes from the first family intersect in a
unique site, and that site lies on the Veronese surface.

It's a similar story for two planes $T_P$ and $T_Q$ from the second family. If
$P = Q$, then the tangent planes $T_P$ and $T_Q$ coincide. Otherwise, the 2-site
$PQ$ is the unique site that belongs to both $T_P$ and $T_Q$. But note that the
site $PQ$ lies off of the Veronese surface, rather than on it; in that detail, the
second family differs from the first.

What about a plane $\Pi_L$ from the first family and a plane $T_P$ from the
second? If $P$ lies on $L$, then the planes $\Pi_L$ and $T_P$ intersect in the entire line
of 2-sites $\{PQ \mid Q \in L\}$. But, if $P$ does not lie on $L$, then the planes $\Pi_L$ and
$T_P$ are skew.

The cubic 4-fold $\Delta = 0$ thus has two 2-parameter families of planes
that lie inside it, where two planes drawn from the same family intersect
in a flat of even dimension, while two planes drawn from different families
intersect in a flat of odd dimension. That geometry is hardly surprising, since
the situation for a nonsingular quadric 4-fold in 5-space is the same, except
that each of the two families of planes lying inside a nonsingular quadric
is a 3-parameter family. The moral of the story is not that the geometry is
surprising, but rather that we can uncover that geometry easily by exploiting
the multiplication in the algebra of sites.
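
The claim that both families of planes lie inside the cubic hypersurface can be spot-checked numerically. The following is a minimal sketch under stated assumptions: a 2-site over the plane is stored as the (doubled) symmetric 3-by-3 coefficient matrix of its anchors, and the discriminant of Equation 6.5-1 is assumed to be proportional to the determinant of that matrix, which vanishes exactly when the quadratic factors over the complexes. The helper names `site`, `combo`, and `det3` are hypothetical.

```python
def site(p, q):
    """Doubled symmetric matrix p q^T + q p^T standing for the 2-site pq."""
    return [[p[i] * q[j] + q[i] * p[j] for j in range(3)] for i in range(3)]

def combo(coeffs, mats):
    """Linear combination of 3x3 matrices."""
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(3)]
            for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix (assumed proportional to the discriminant)."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# Points are written as anchors (weight, x, y) with weight 1.
A, B, C = (1, 0, 0), (1, 1, 2), (1, 2, 4)   # three points on the line y = 2x
P, Q, R = (1, 0, 0), (1, 1, 0), (1, 0, 1)   # three noncollinear points

# Unit-weight combinations (coefficients sum to 1) of sites in each plane.
in_image_plane = combo([2, -3, 2], [site(A, A), site(B, B), site(C, C)])
in_tangent_plane = combo([5, -1, -3], [site(P, P), site(P, Q), site(P, R)])
generic_site = combo([1, 1, -1], [site(P, P), site(Q, Q), site(R, R)])

print(det3(in_image_plane), det3(in_tangent_plane), det3(generic_site))
```

The first two determinants vanish because every site in an image plane or a tangent plane factors, while the third site, a combination of squares of noncollinear points, does not.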

One place where these two families of planes arise is in Degen's handling
of degenerate cases. Recall that Degen classifies the different ways in which
the center line $\Lambda$ of a proposed projection can intersect the cubic hypersurface
$\Delta = 0$. The most degenerate thing that can happen is for $\Lambda$ to lie entirely
within the hypersurface $\Delta = 0$. Degen proves [16] that any such line $\Lambda$ lies,
in fact, either entirely inside the image plane $\Pi_L$, for a unique line $L$ in $A$, or
entirely inside the tangent plane $T_P$, for a unique point $P$ in $A$. (Those two
options can happen simultaneously, if $\Lambda$ is the line where $\Pi_L$ intersects $T_P$,
for some point $P$ on some line $L$.)

Exercise 6.6-1 Given two distinct points $P$ and $Q$ in $A$, on how many
planes of each family does the site $PQ$ lie? What about the site $P^2$?

Answer: The site $PQ$ lies on one plane of the first family, the plane $\Pi_L$
where $L$ is the line joining $P$ to $Q$. It lies on two planes of the second family,
the planes $T_P$ and $T_Q$. As for the site $P^2$, it lies on a one-parameter family
of planes of the first family: on the plane $\Pi_L$, for each line $L$ passing through
$P$. It lies on just one plane of the second family, the plane $T_P$.

6.6.2 The typical complete quadrilateral 

Let's now turn from the most degenerate things that can happen to the thing
that happens typically: Our proposed center line $\Lambda$ typically intersects the
cubic hypersurface $\Delta = 0$ at three distinct sites. Let's further assume that
all three of those sites of intersection are real; this is the case that Degen [17]
refers to as (Aa). And finally, for simplicity, let's assume that each of the
three sites of intersection factors, not only over the complexes, but actually
over the reals. So, for some three pairs of points $\{P_1, Q_1\}$, $\{P_2, Q_2\}$, and
$\{P_3, Q_3\}$ in $A$, our center line $\Lambda$ intersects the hypersurface $\Delta = 0$ precisely
at the three sites $P_1Q_1$, $P_2Q_2$, and $P_3Q_3$. What geometric relationships hold
among the $P$'s and $Q$'s?

Given two of the three pairs, say $\{P_1, Q_1\}$ and $\{P_2, Q_2\}$, the third pair
$\{P_3, Q_3\}$ is determined, since the line joining the site $P_1Q_1$ to $P_2Q_2$ intersects
the hypersurface $\Delta = 0$ at those two sites and at one more site, which must
be $P_3Q_3$. But we can also locate $P_3$ and $Q_3$ by carrying out a geometric
construction in the plane $A$. To do so, note that the four points $P_1$, $Q_1$, $P_2$,
and $Q_2$ are coplanar in $A$, so they must be linearly dependent; that is, there
must exist real numbers $a_1$, $b_1$, $-a_2$, and $b_2$ with

$$a_1P_1 + b_1Q_1 - a_2P_2 + b_2Q_2 = 0.$$

(We write $-a_2$ rather than $a_2$ in order to make what follows symmetric
under cyclic permutations of the subscripts 1, 2, and 3.) To save writing,
let's introduce $p_1$ as an abbreviation for the anchor $p_1 := a_1P_1$, and similarly
for $q_1 := b_1Q_1$, $p_2 := a_2P_2$, and $q_2 := b_2Q_2$. In terms of those anchors, we
have the dependence

$$p_1 + q_1 - p_2 + q_2 = 0. \tag{6.6-2}$$

Using that dependence, we find that

$$
\begin{aligned}
p_1q_1 + p_2q_2 &= p_1q_1 + p_2q_2 + (p_1 + q_1 - p_2 + q_2)\,q_2 \\
&= p_1q_1 + p_1q_2 + q_1q_2 + q_2^2 \\
&= (p_1 + q_2)(q_1 + q_2).
\end{aligned}
$$

Thus, the line joining $P_1Q_1$ to $P_2Q_2$ passes through the site $(p_1+q_2)(q_1+q_2) =
(a_1P_1 + b_2Q_2)(b_1Q_1 + b_2Q_2)$; so the two anchors in that product must be
scaled versions of the points $P_3$ and $Q_3$, in some order. Choosing an order
and undoing the scaling, we set

$$P_3 := \frac{a_1P_1 + b_2Q_2}{a_1 + b_2} \qquad\text{and}\qquad Q_3 := \frac{b_1Q_1 + b_2Q_2}{b_1 + b_2}.$$

(If either of those denominators is zero, the corresponding point is at infinity;
let's not worry about that possibility.) If we now set $a_3 := a_1 + b_2$,
$b_3 := -(b_1 + b_2)$, $p_3 := a_3P_3$, and $q_3 := b_3Q_3$, we have established the following four
equations, the first three of which are cyclically symmetric:

$$
\begin{aligned}
q_1 - p_2 + p_3 &= 0 \\
p_1 + q_2 - p_3 &= 0 \\
-p_1 + p_2 + q_3 &= 0 \\
q_1 + q_2 + q_3 &= 0
\end{aligned}
$$

The last two equations reveal that $Q_3$ is the point where the line $P_1 \vee P_2$
intersects the line $Q_1 \vee Q_2$, while the first two reveal that $P_3$ is given by
$P_3 = (P_2 \vee Q_1) \wedge (P_1 \vee Q_2)$. (Note that we are writing $S \vee T$ to denote
the line joining $S$ to $T$, to avoid confusion with the 2-site $ST$.) To say the
same thing more symmetrically, of the eight possible combinations of one
point from each pair, the four collinear triples are the ones with an even
number of $P$'s and an odd number of $Q$'s. So the three pairs of points
$\{P_1, Q_1\}$, $\{P_2, Q_2\}$, and $\{P_3, Q_3\}$ are the pairs of opposite vertices of a
complete quadrilateral, as shown in Figure 6.3. Thus, a center line $\Lambda$ in 5-space
typically corresponds to a complete quadrilateral in the plane $A$ in the sense
that the three intersections of $\Lambda$ with the hypersurface $\Delta = 0$ are the products
of that quadrilateral's three opposite pairs of vertices. That's neat.
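
The construction above can be replayed on concrete numbers. This is a minimal sketch, assuming four hypothetical points given as anchors (weight, x, y) and representing a 2-site by its doubled symmetric coefficient matrix; the dependence coefficients are obtained by Cramer-style 3-by-3 determinants, which is one convenient choice rather than anything prescribed by the text.

```python
def det3(u, v, w):
    """Determinant of the 3x3 matrix with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def scale(c, v):
    return tuple(c * x for x in v)

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def site(p, q):
    """Doubled symmetric matrix p q^T + q p^T standing for the 2-site pq."""
    return tuple(tuple(p[i] * q[j] + q[i] * p[j] for j in range(3))
                 for i in range(3))

def mat_add(m, n):
    return tuple(tuple(x + y for x, y in zip(r, s)) for r, s in zip(m, n))

# Hypothetical input: four points of the plane as anchors (weight, x, y).
P1, Q1, P2, Q2 = (1, 0, 0), (1, 4, 0), (1, 1, 2), (1, 1, -1)

# Coefficients of the dependence a1*P1 + b1*Q1 - a2*P2 + b2*Q2 = 0.
a1 = det3(Q1, P2, Q2)
b1 = -det3(P1, P2, Q2)
a2 = -det3(P1, Q1, Q2)
b2 = -det3(P1, Q1, P2)

p1, q1, p2, q2 = scale(a1, P1), scale(b1, Q1), scale(a2, P2), scale(b2, Q2)
p3 = add(p1, q2)               # the anchor a3 * P3
q3 = scale(-1, add(q1, q2))    # the anchor b3 * Q3

# p1 q1 + p2 q2 = (p1 + q2)(q1 + q2), compared as coefficient matrices.
identity_holds = mat_add(site(p1, q1), site(p2, q2)) == site(p3, scale(-1, q3))

# The four triples with an even number of P's: each determinant should vanish.
collinear_triples = [det3(q1, p2, p3), det3(p1, q2, p3),
                     det3(p1, p2, q3), det3(q1, q2, q3)]
print(identity_holds, collinear_triples)
```

A triple with an odd number of $P$'s, such as $(P_1, P_2, P_3)$, gives a nonzero determinant for this input, so it is not collinear.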

Figure 6.3: The complete quadrilateral from a typical center line $\Lambda$



Geometry remark: The complete quadrilateral captures the most nondegenerate
way in which three lineal 2-sites $P_1Q_1$, $P_2Q_2$, and $P_3Q_3$ over a plane
$A$ can be collinear in the space $\mathrm{Sym}_2(\hat{A})_1$. For other values of $n$, is there
an analogous geometric configuration that captures the most nondegenerate
way in which $n + 1$ lineal $n$-sites over an $n$-space $A$ can span a flat in the
space $\mathrm{Sym}_n(\hat{A})_1$ whose dimension is less than $n$? When $n = 3$, the answer is
yes. Ramshaw and Saxe [44] analyze the solution for $n = 3$, a configuration
that captures the coplanarity of the four 3-sites $P_1Q_1R_1$, $P_2Q_2R_2$, $P_3Q_3R_3$,
and $P_4Q_4R_4$ by constraining those twelve points in 3-space to be incident to
two lines and thirteen planes, in a pattern described by the budget matroid
$B_{2,1,1}$. Since complete quadrilaterals, which are the solution for $n = 2$, are
representations of the budget matroid $B_{2,1}$, this suggests a pattern.
Unfortunately, it seems that the representations of the budget matroid $B_{2,1,1,1}$ cannot
provide an analogous solution when $n = 4$, since it seems that there are only
30 dimensions' worth of such representations, rather than the 34 dimensions'
worth that would be required.

Warning: Ramshaw and Saxe [44] don't exploit the multiplication in the
algebra of sites; hence, they don't prove that the four products $(P_iQ_iR_i)_{1 \le i \le 4}$
are coplanar. (Indeed, I was only beginning to realize, back then, that it
makes sense to multiply points.) But they do prove a certain property of the
slopes of the twelve planes that result when those twelve points are projected
from an arbitrary line in 3-space, and that slope property turns out to be
equivalent to the coplanarity of the four products.

The dashed lines $P_1 \vee Q_1$, $P_2 \vee Q_2$, and $P_3 \vee Q_3$ in Figure 6.3 are called
the diagonals of the complete quadrilateral, and they have a role to play
in Degen's analysis as well. Let $D_i$, for $i$ from 1 to 3, be the vertex of
the dashed triangle that is opposite the side $P_i \vee Q_i$, as shown in Figure 6.3.
Equation 6.6-2 tells us that $p_1 + q_1 = p_2 - q_2$. The point $D_3$ is the intersection
of the diagonals $P_1 \vee Q_1$ and $P_2 \vee Q_2$, so $D_3$ must be a scalar multiple of the
anchor $d_3 := p_1 + q_1 = p_2 - q_2$. Introducing scalar multiples of $D_1$ and $D_2$ in
a similar way, we have

$$
\begin{aligned}
d_1 &:= p_2 + q_2 = p_3 - q_3 \\
d_2 &:= p_3 + q_3 = p_1 - q_1 \\
d_3 &:= p_1 + q_1 = p_2 - q_2.
\end{aligned}
$$

These formulas reveal some interesting collinearities in the 5-space $\mathrm{Sym}_2(\hat{A})_1$,
as shown in Figure 6.4. The simple identity

$$(p_1 - q_1)^2 + 4\,p_1q_1 = (p_1 + q_1)^2$$

shows that the 2-sites $D_2^2$, $P_1Q_1$, and $D_3^2$ are collinear; and the same holds
with the subscripts cyclically permuted. Any two of those three collinearities
suffice to show that the plane in 5-space spanned by the three perfect squares
$D_1^2$, $D_2^2$, and $D_3^2$ contains our entire proposed center line $\Lambda$, the line through
the collinear 2-sites $P_1Q_1$, $P_2Q_2$, and $P_3Q_3$. Thus, when Degen takes the
planes in 5-space through the line $\Lambda$ as the points of his object 3-space,
the particular plane shown in Figure 6.4 will belong to the resulting surface
instance (the projected image of the Veronese surface $\sigma_2(A)$) for three
different reasons. In fact, that point in 3-space is the triple point of the
resulting quartic Steiner surface.

Figure 6.4: The plane projector that corresponds to the triple point
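
As a cross-check on these collinearities, here is a short worked computation (a consequence of the identity, not a formula quoted from Degen). Applying the identity to each of the three index pairs, with $d_1 = p_2 + q_2 = p_3 - q_3$, $d_2 = p_3 + q_3 = p_1 - q_1$, and $d_3 = p_1 + q_1 = p_2 - q_2$, expresses each product as a difference of two of the squares:

```latex
\begin{aligned}
4\,p_1 q_1 &= (p_1 + q_1)^2 - (p_1 - q_1)^2 = d_3^2 - d_2^2, \\
4\,p_2 q_2 &= (p_2 + q_2)^2 - (p_2 - q_2)^2 = d_1^2 - d_3^2, \\
4\,p_3 q_3 &= (p_3 + q_3)^2 - (p_3 - q_3)^2 = d_2^2 - d_1^2,
\end{aligned}
\qquad\text{so}\qquad
p_1 q_1 + p_2 q_2 + p_3 q_3 = 0 .
```

The vanishing sum agrees with the earlier factorization $p_1q_1 + p_2q_2 = (p_1 + q_2)(q_1 + q_2) = -p_3q_3$, and it exhibits all three products as lying in the plane spanned by the three squares.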

Recall what happens when we project the twisted cubic curve $\kappa_3(L)$ in
3-space, from some center point, to get a rational cubic curve in the plane.
For a typical choice of the center point $C$, there is a unique line through $C$
that intersects $\kappa_3(L)$ twice. That is, most points lie on a unique chord of
the twisted cubic, that chord giving rise to a double point on the projected,
planar cubic. If we define a 2-chord of the Veronese surface $\sigma_2(A)$ to be the
plane spanned by the squares of three noncollinear points in the plane $A$,
Degen's analysis shows that most lines $\Lambda$ in 5-space lie on a unique 2-chord
of the Veronese surface, that 2-chord giving rise to a triple point on the
projected surface in 3-space.

Exercise 6.6-3 The triple point of the Steiner surface in case (Aa) is the
intersection of three noncoplanar double lines, that is, lines where the Steiner
surface intersects itself. A point on one of those double lines corresponds to
a plane projector through $\Lambda$ that contains both $S^2$ and $T^2$, for some two
distinct points $S$ and $T$ in the plane $A$. Where do the points $S$ and $T$ in
such a pair lie in Figure 6.3?

Answer: For some $i$ between 1 and 3, the points $S$ and $T$ are harmonic
conjugates of $P_i$ and $Q_i$, along the dashed line $P_i \vee Q_i$. At the triple point
itself, any two of the three points $D_1$, $D_2$, and $D_3$ can play the roles of $S$ and
$T$; for example, $D_1$ and $D_2$ are harmonic conjugates of $P_3$ and $Q_3$.

6.6.3 Extending Degen's analysis to cubics 

Extending Degen's analysis from quadratics to cubics would be a challenging 
endeavor in which the algebra of sites might well prove useful. 

The prototype for a cubic Bezier triangle is the Veronese surface of
parametric degree 3, the surface $\sigma_3(A) = \{P^3 \mid P \in A\}$. This surface has degree
$3^2 = 9$ and sits in the 9-space $\mathrm{Sym}_3(\hat{A})_1$. To end up with a surface in an
object 3-space, the instancing transformation must reduce the dimension by
6; thus, it will project from some central 5-flat $H$, sitting in 9-space, down
into the 3-space of all 6-flats that include $H$. The character of the Bezier
triangle that results from this projection will presumably depend upon how
the 5-flat $H$ sits in 9-space, in relation to such structures as the following:

• the surface $\sigma_3(A)$ itself, the 2-fold of 3-sites that are perfect cubes;

• the 4-fold of 3-sites that are divisible by a perfect square; 

• the 6-fold of complex-lineal 3-sites, the sites that are divisible by some 
three anchors, typically all distinct; 

• and the 7-fold of reducible 3-sites, the sites that have some anchor as a 
factor, but where the quadratic cofactor may be absolutely irreducible. 

Indeed, some of these dependencies are straightforward; for example, the
degree of the projected Bezier triangle will be 9 minus the number of points
where the central 5-flat $H$ intersects the Veronese surface $\sigma_3(A)$ itself. But
other dependencies will be more subtle. We leave those questions open,
returning in a moment to the main thread of this monograph.

By the way, it will follow from Proposition 6.7-2 that the second structure
in the list above, the 4-fold, is the union of the tangent planes to the Veronese
surface $\sigma_3$; that is, for a point $P$ in the plane $A$ and a 3-site $s$ over $A$, the
site $s$ is divisible by $P^2$ just when $s$ lies in the tangent plane to the surface
$\sigma_3$ at $P^3$. In a similar way, the fourth variety in the list above, the 7-fold, is
the union, for all points $P$ in the plane $A$, of the 5-flat that osculates $\sigma_3$ to
second order at $P^3$. Note that each such osculating flat is 5-dimensional, since
it is spanned by the two first derivatives $\partial P^3/\partial u$ and $\partial P^3/\partial v$ and the three
second derivatives $\partial^2 P^3/\partial u^2$, $\partial^2 P^3/\partial u\,\partial v$, and $\partial^2 P^3/\partial v^2$.

Degen himself extended his work in a different direction by classifying the 
types of surfaces that can arise as tensor-product surfaces of bidegree (2; 1) 
in 3-space — that is, by classifying the Bezier rectangles that are quadratic 
in one parameter direction and affine in the other [18]. We discuss tensor- 
product surfaces in Section 6.8. 

6.7 Polynomial $d$-folds of degree at most $n$

The theory of Veronese prototypes generalizes to $n$-ic $d$-folds, for any bound
$n$ on the total degree and any parametric dimension $d$. Let $A$ be our affine
domain space, now of dimension $d$. The linearization $\hat{A}$ has dimension $d+1$,
so the algebras $\mathrm{Sym}(\hat{A})$ and $\mathrm{Sym}(\hat{A}^*)$ of sites over $A$ and forms on $A$ are
essentially polynomial algebras with $d + 1$ variables. An $n$-ic polynomial
parametric $d$-fold, sitting in some affine object space $O$, is a map $F\colon A \to O$
that can be given by polynomials of total degree at most $n$ in the coordinates
on $A$. There is a prototype for all such $d$-folds $F$: the $n$th-power map
$\theta_{d,n}\colon A \to \mathrm{Sym}_n(\hat{A})_1$ given by $\theta_{d,n}(P) := P^n$, for all points $P$ in the $d$-space
$A$. In particular, for any $n$-ic parametric $d$-fold $F\colon A \to O$, there is a unique
affine transformation $f\colon \mathrm{Sym}_n(\hat{A})_1 \to O$ with $F(P) = f(\theta_{d,n}(P)) = f(P^n)$,
for all points $P$ in $A$. We'll refer to $\theta_{d,n}$ as the Veronese $d$-fold of parametric
degree $n$; the map $\theta_{d,n}$ is also known as the $n$-uple embedding of $d$-space [33].
The moment curve $\kappa_n$ is the Veronese 1-fold $\kappa_n = \theta_{1,n}$, while the Veronese
surface $\sigma_n$ is the Veronese 2-fold $\sigma_n = \theta_{2,n}$.

What is a Bezier point in this context? We choose some $d$-simplex of
reference, say $[R_0, \ldots, R_d]$ in $A$. The points $(R_0, \ldots, R_d)$ form an affine
frame for the affine space $A$ and also form a basis for its linearization $\hat{A}$.
Hence, we can view the algebra of sites $\mathrm{Sym}(\hat{A})$ as the polynomial algebra
$\mathbb{R}[R_0, \ldots, R_d]$. Consider the $n$-sites $R_0^{i_0} \cdots R_d^{i_d}$, where $i_0$ through $i_d$ are any
nonnegative integers with $i_0 + \cdots + i_d = n$. There are $\binom{n+d}{d}$ such sites,
and they form an affine frame for the affine space $\mathrm{Sym}_n(\hat{A})_1$ of all unit-weight
$n$-sites over $A$. One convenient way to specify which affine instancing
transformation $f\colon \mathrm{Sym}_n(\hat{A})_1 \to O$ we have in mind is to specify the images
of the sites in that frame, each image $f(R_0^{i_0} \cdots R_d^{i_d})$ being a Bezier point of
the resulting $d$-fold.

Exercise 6.7-1 Consider (real) $n$-sites of unit weight on an affine $d$-space.
How many dimensions' worth of such sites are there altogether? How many
are complex-lineal? Are real-lineal? Are perfect $n$th powers?
Answers: $\binom{n+d}{d} - 1$, $nd$, $nd$, and $d$.
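
The counts in this section are easy to tabulate. The following is a minimal sketch (the function names `frame_exponents` and `dimension_counts` are hypothetical): it enumerates the exponent tuples $(i_0, \ldots, i_d)$ that label the sites of the Bezier frame and records the four dimensions from Exercise 6.7-1.

```python
from math import comb

def frame_exponents(d, n):
    """All exponent tuples (i_0, ..., i_d) of nonnegative integers summing to n."""
    if d == 0:
        return [(n,)]
    return [(i,) + rest
            for i in range(n + 1)
            for rest in frame_exponents(d - 1, n - i)]

def dimension_counts(d, n):
    """Dimensions from Exercise 6.7-1 for unit-weight n-sites over an affine d-space."""
    return {
        "all": comb(n + d, d) - 1,
        "complex-lineal": n * d,
        "real-lineal": n * d,
        "perfect powers": d,
    }

print(len(frame_exponents(2, 2)), dimension_counts(2, 2))
```

For $d = 2$ and $n = 2$ this reproduces the numbers of Section 6.6: six frame sites spanning a 5-space, a 4-fold of complex-lineal sites, and a Veronese surface of dimension 2. For $d = 2$ and $n = 3$ it gives the 9-space and the 6-fold of complex-lineal 3-sites.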

What about the flat that osculates the Veronese $d$-fold $\theta_{d,n}$ to $k$th order
at the $n$-site $\theta_{d,n}(P) = P^n$, for some $k \le n$? The only surprising thing about
that osculating flat is its dimension, which is $\binom{k+d}{d} - 1$. Let $(C, \varphi_1, \ldots, \varphi_d)$
be some Cartesian reference frame for the space $A$, and let $(w, u_1, \ldots, u_d)$
be the dual basis for the linear space $\hat{A}^*$ of coanchors. If $F\colon A \to O$ is any
parametric $d$-fold, the flat that osculates $F$ to 0th order at $P$ is the point
$F(P)$. For the flat that osculates $F$ to 1st order at $P$, we expand to include
the $d$ vectors $\partial F/\partial u_1(P)$ through $\partial F/\partial u_d(P)$. For osculation to 2nd order,
we include $\binom{d+1}{2}$ second-order partials, either pure or mixed: $\partial^2 F/\partial u_1^2(P)$,
$\partial^2 F/\partial u_1\,\partial u_2(P)$, and so on. Osculation to 3rd order adds in $\binom{d+2}{3}$ third-order
partials. For osculation to $k$th order, we have a total of

$$\binom{d}{1} + \binom{d+1}{2} + \cdots + \binom{d+k-1}{k} = \binom{d+k}{k} - 1$$

vectors. In the particular case where $F = \theta_{d,n}$ is the Veronese $d$-fold of some
degree $n \ge k$, all of these vectors will be linearly independent, since $\theta_{d,n}$ is a
prototype for any $n$-ic $d$-fold.
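
The summation formula for the number of partials can be checked mechanically. This is a small sketch with hypothetical function names, comparing the term-by-term count with the closed form over a range of small $d$ and $k$.

```python
from math import comb

def count_partials(d, k):
    """Number of partial derivatives of orders 1 through k in d variables."""
    return sum(comb(d + j - 1, j) for j in range(1, k + 1))

def osculating_dimension(d, k):
    """Closed form for the dimension of the k-th order osculating flat."""
    return comb(d + k, k) - 1

# Compare the two counts for small parameter values.
agree = all(count_partials(d, k) == osculating_dimension(d, k)
            for d in range(1, 7) for k in range(0, 7))
print(agree, osculating_dimension(2, 2))
```

For a surface ($d = 2$) and second-order osculation ($k = 2$), the closed form gives 5, matching the 5-flats spanned by two first derivatives and three second derivatives in Section 6.6.3.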

Proposition 6.7-2 Let $A$ be an affine $d$-space, let $\theta_{d,n}\colon A \to \mathrm{Sym}_n(\hat{A})_1$ be
the Veronese $d$-fold of degree $n$, and let $P$ be a point in $A$. The flat that
osculates $\theta_{d,n}$ to $k$th order at $\theta_{d,n}(P) = P^n$ is $P^{n-k}\,\mathrm{Sym}_k(\hat{A})_1$, the flat of
dimension $\binom{k+d}{d} - 1$ that consists of all unit-weight multiples of $P^{n-k}$.

Proof The only subtlety, in comparison with the proof of the case $d = 1$
in Proposition 6.3-1, is that we must consider approaching the point $P$ in
some arbitrary way, not necessarily along a straight line. Let $b$ denote some
vector in $\mathbb{R}^d$ and let $\|b\|$ be the norm of $b$ in some fixed norm for $\mathbb{R}^d$
(it doesn't matter which). We analyze the $n$th power $(P + b \cdot \varphi)^n$ as $\|b\|$
tends to zero, where $b \cdot \varphi$ denotes the vector $b \cdot \varphi := b_1\varphi_1 + \cdots + b_d\varphi_d$.
Let $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_d)$ denote a multi-index of nonnegative integers with
$|\alpha| = n$, and let $\alpha_+ := (\alpha_1, \ldots, \alpha_d)$ denote the dehomogenized version of $\alpha$,
with $\alpha_0$ removed. By the Multinomial Theorem, we have

$$
\begin{aligned}
(P + b \cdot \varphi)^n &= \sum_{|\alpha| = n} \binom{n}{\alpha} P^{\alpha_0}\, b^{\alpha_+} \varphi^{\alpha_+} \\
&= \sum_{\substack{|\alpha| = n \\ \alpha_0 \ge n-k}} \binom{n}{\alpha} P^{\alpha_0}\, b^{\alpha_+} \varphi^{\alpha_+} + O\bigl(\|b\|^{k+1}\bigr),
\end{aligned}
$$

as $\|b\|$ goes to zero. The sum here is a multiple of $P^{n-k}$ and also has weight
1, since all terms have weight 0 except for $P^n$. Thus, the flat $P^{n-k}\,\mathrm{Sym}_k(\hat{A})_1$
osculates the Veronese $d$-fold $\theta_{d,n}$ to $k$th order at $P^n$. □
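
To make the truncation concrete, here is a worked instance of the expansion for the smallest interesting case, chosen for illustration: a surface ($d = 2$), cubic degree ($n = 3$), and first-order osculation ($k = 1$).

```latex
\begin{aligned}
(P + b\cdot\varphi)^3
  &= P^3 + 3P^2\,(b\cdot\varphi) + 3P\,(b\cdot\varphi)^2 + (b\cdot\varphi)^3 \\
  &= P^2\bigl(P + 3\,b\cdot\varphi\bigr) + O\bigl(\|b\|^2\bigr),
  \qquad b\cdot\varphi = b_1\varphi_1 + b_2\varphi_2 .
\end{aligned}
```

The cofactor $P + 3\,b\cdot\varphi$ has weight 1, since the vectors $\varphi_1$ and $\varphi_2$ have weight 0. So the first-order part is a unit-weight multiple of $P^2$, lying in the flat $P^2\,\mathrm{Sym}_1(\hat{A})_1$ of dimension $\binom{1+2}{2} - 1 = 2$. That is the tangent plane referred to at the end of Section 6.6.3.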

6.8 Tensor-product surfaces 

Despite the generality of the preceding section, there is yet another case to 
consider. So far, the degree bounds that we have been imposing are bounds 
on the total degree in all of the variables. Another option is to impose 
separate bounds on the degrees in disjoint sets of variables. In particular, 
the most common way to define a polynomial surface in CAGD is to impose 
separate bounds on the degrees of its defining polynomials in the two variables 
u and v. The resulting surfaces are the tensor-product surfaces, which can 
be thought of as curves of curves. 

For tensor-product surfaces, we decompose the parameter plane $A$ as the
product of two lines, say $A = L_1 \times L_2$, and we linearize each of those lines
separately. Suppose that we choose $C_1$ and $\varphi_1$ to be a center point and a unit
vector for the line $L_1$, while $C_2$ and $\varphi_2$ are the same for $L_2$. Linearizing $L_1$
gives us the linear 2-space $\hat{L}_1$ of anchors over $L_1$, where each such anchor $p_1$
can be written uniquely as a linear combination $p_1 = w_1(p_1)\,C_1 + u_1(p_1)\,\varphi_1$.
The coanchors $(w_1, u_1)$ here are the basis for $\hat{L}_1^*$ that is dual to the basis
$(C_1, \varphi_1)$ for $\hat{L}_1$. All the same goes for $L_2$.

Both forms on $L_1 \times L_2$ and sites over $L_1 \times L_2$ have separate degrees $n_1$ and
$n_2$ in the two parameters $L_1$ and $L_2$, the pair $(n_1;n_2)$ being called the bidegree
of the form or site. Table 6.1 gives the formulas by which we shall denote the
spaces of forms and sites of bidegree $(n_1;n_2)$ when characterized abstractly or
when constructed by one of our three concrete constructions. On the first line,
we are abstractly characterizing the algebrization of the linear space $\hat{L}_1 \oplus \hat{L}_2$
or $\hat{L}_1^* \oplus \hat{L}_2^*$ using a universal mapping condition, as discussed in Chapter 9.
On the second line, fixing our chosen bases, an $(n_1;n_2)$-form on $L_1 \times L_2$ is a
polynomial in the four variables $w_1$, $u_1$, $w_2$, and $u_2$ that is homogeneous of
degree $n_1$ in $w_1$ and $u_1$ and separately homogeneous of degree $n_2$ in $w_2$ and $u_2$.
That space of polynomials is most simply written $\mathbb{R}_{n_1;n_2}[w_1, u_1; w_2, u_2]$; but
people who understand the tensor-product construction will see that it can
equally well be written $\mathbb{R}_{n_1}[w_1, u_1] \otimes \mathbb{R}_{n_2}[w_2, u_2]$, and that is how the term
"tensor-product surface" arose. An $(n_1;n_2)$-site over $L_1 \times L_2$ is analogous, but
with the anchor variables $C_1$, $\varphi_1$, $C_2$, and $\varphi_2$. Moving to the third line, we can
allow anchors or coanchors that are linearly dependent into our polynomials,
as long as we realize that any given $(n_1;n_2)$-site or $(n_1;n_2)$-form will then have
multiple, equivalent names, so we must mod out by an equivalence relation.
The fourth line exploits duality to interpret each of those equivalence classes
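as a recipe for a real-valued function. We view an $(n_1;n_2)$-form as defining
a real-valued function of bidegree $(n_1;n_2)$ on $\hat{L}_1 \times \hat{L}_2$, while an $(n_1;n_2)$-site
defines a function on $\hat{L}_1^* \times \hat{L}_2^*$. But keep in mind that real-valued functions
on coanchors don't have the obvious applications in CAGD that real-valued
functions on anchors have.

$$
\begin{array}{ll}
\text{space of } (n_1;n_2)\text{-sites} & \text{space of } (n_1;n_2)\text{-forms} \\
\hline
\mathrm{Sym}_{n_1;n_2}(\hat{L}_1 \oplus \hat{L}_2) & \mathrm{Sym}_{n_1;n_2}(\hat{L}_1^* \oplus \hat{L}_2^*) \\
\quad = \mathrm{Sym}_{n_1}(\hat{L}_1) \otimes \mathrm{Sym}_{n_2}(\hat{L}_2) & \quad = \mathrm{Sym}_{n_1}(\hat{L}_1^*) \otimes \mathrm{Sym}_{n_2}(\hat{L}_2^*) \\
\hline
\mathbb{R}_{n_1;n_2}[C_1, \varphi_1; C_2, \varphi_2] & \mathbb{R}_{n_1;n_2}[w_1, u_1; w_2, u_2] \\
\quad = \mathbb{R}_{n_1}[C_1, \varphi_1] \otimes \mathbb{R}_{n_2}[C_2, \varphi_2] & \quad = \mathbb{R}_{n_1}[w_1, u_1] \otimes \mathbb{R}_{n_2}[w_2, u_2] \\
\hline
\mathbb{R}_{n_1;n_2}[\hat{L}_1; \hat{L}_2] / {\sim}_{(\hat{L}_1;\hat{L}_2)} & \mathbb{R}_{n_1;n_2}[\hat{L}_1^*; \hat{L}_2^*] / {\sim}_{(\hat{L}_1^*;\hat{L}_2^*)} \\
\quad = (\mathbb{R}_{n_1}[\hat{L}_1] / {\sim}_{\hat{L}_1}) \otimes (\mathbb{R}_{n_2}[\hat{L}_2] / {\sim}_{\hat{L}_2}) & \quad = (\mathbb{R}_{n_1}[\hat{L}_1^*] / {\sim}_{\hat{L}_1^*}) \otimes (\mathbb{R}_{n_2}[\hat{L}_2^*] / {\sim}_{\hat{L}_2^*}) \\
\hline
\mathrm{Bipoly}_{n_1;n_2}(\hat{L}_1^* \times \hat{L}_2^*, \mathbb{R}) & \mathrm{Bipoly}_{n_1;n_2}(\hat{L}_1 \times \hat{L}_2, \mathbb{R}) \\
\quad = \mathrm{Poly}_{n_1}(\hat{L}_1^*, \mathbb{R}) \otimes \mathrm{Poly}_{n_2}(\hat{L}_2^*, \mathbb{R}) & \quad = \mathrm{Poly}_{n_1}(\hat{L}_1, \mathbb{R}) \otimes \mathrm{Poly}_{n_2}(\hat{L}_2, \mathbb{R})
\end{array}
$$

Table 6.1: Formulas for the space of sites or forms of bidegree $(n_1;n_2)$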

Given any site $s$ over $L_1 \times L_2$, we define the weight of $s$ to be the real
number $s(w_1, w_2)$ that results from evaluating $s$ at the weight coanchors $w_1$
and $w_2$ of $L_1$ and $L_2$. That is, given any expression for $s$ as a polynomial
whose variables are anchors over $L_1$ or $L_2$, we replace each anchor $p_1$ on $L_1$
by its weight $w_1(p_1)$, we replace each anchor $p_2$ on $L_2$ by its weight $w_2(p_2)$,
and we then simplify to get the weight $s(w_1, w_2)$. Going back to the second
line in Table 6.1, if a site $s$ of bidegree $(n_1;n_2)$ has been represented as a
polynomial in $\mathbb{R}_{n_1;n_2}[C_1, \varphi_1; C_2, \varphi_2]$, then its weight is simply the coefficient
of the term $C_1^{n_1} C_2^{n_2}$.
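
The weight rule can be tried out on a small polynomial model. This is a minimal sketch, assuming that a site is stored as a dictionary from exponent tuples (degrees in $C_1$, $\varphi_1$, $C_2$, $\varphi_2$) to coefficients; the helper names are hypothetical.

```python
from collections import defaultdict

# Exponent tuples are (deg C1, deg phi1, deg C2, deg phi2).
def anchor_on_L1(w, u):
    """The anchor w*C1 + u*phi1 over the first line."""
    return {(1, 0, 0, 0): w, (0, 1, 0, 0): u}

def anchor_on_L2(w, u):
    """The anchor w*C2 + u*phi2 over the second line."""
    return {(0, 0, 1, 0): w, (0, 0, 0, 1): u}

def multiply(f, g):
    """Product of two polynomials stored as {exponents: coefficient}."""
    out = defaultdict(int)
    for ef, cf in f.items():
        for eg, cg in g.items():
            out[tuple(a + b for a, b in zip(ef, eg))] += cf * cg
    return {e: c for e, c in out.items() if c != 0}

def product(anchors):
    """Product of a list of anchors, starting from the constant 1."""
    result = {(0, 0, 0, 0): 1}
    for a in anchors:
        result = multiply(result, a)
    return result

def weight(site, n1, n2):
    """Coefficient of C1^n1 * C2^n2 in a site of bidegree (n1; n2)."""
    return site.get((n1, 0, n2, 0), 0)

# A (2; 3)-site: two unit-weight anchors over L1 times three over L2.
s = product([anchor_on_L1(1, 0), anchor_on_L1(1, 1),
             anchor_on_L2(1, 2), anchor_on_L2(1, 3), anchor_on_L2(1, 4)])
print(weight(s, 2, 3))
```

In this model the weight of a product of anchors is the product of their weights, and the order of the factors does not matter, since `multiply` is symmetric in its arguments.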

Since $L_1$ is an affine line, it is convenient to name the points on $L_1$ using
real numbers. But the same holds for $L_2$, and we don't want to get the two
lines confused; hence, we shall use two different accents. Let's denote by $\grave{t}$
the point $\grave{t} := C_1 + t\varphi_1$ with coordinate $t$ on the line $L_1$, while $\acute{t} := C_2 + t\varphi_2$ is
the point with that same coordinate on $L_2$. (I suggest reading the formulas
$\grave{t}$ and $\acute{t}$ as "t in" and "t out".) For example, the formula $\grave{0}\grave{1}\acute{2}\acute{3}\acute{4}$ denotes a
site over $L_1 \times L_2$ of bidegree $(2;3)$, and the formula $\grave{0}^2\acute{0}^3$ denotes another
such. In fact, we have $\grave{0}^2\acute{0}^3 = C_1^2C_2^3$. When an $(n_1;n_2)$-site over $L_1 \times L_2$ is
real-lineal, that is, splits as the product of $n_1$ anchors over $L_1$ and $n_2$ anchors
on $L_2$, we'll typically write it with its $L_1$ factors to the left of its $L_2$ factors,
by convention. But we could equally well write the factors in any order. Like
the algebra of sites over $A$, the algebra of sites over $L_1 \times L_2$ is commutative,
so $\grave{x}\acute{y} = \acute{y}\grave{x}$, for any real numbers $x$ and $y$.

If we didn't distinguish between points on $L_1$ and points on $L_2$ using
slanted accents, we would have to use some other technique to keep track of
which points lay on which lines. For example, some authors would denote
the $(2;3)$-site $\grave{0}\grave{1}\acute{2}\acute{3}\acute{4}$ as $01 \otimes 234$, where the points to the left of the symbol
"$\otimes$" are presumed to lie on $L_1$, while those to the right lie on $L_2$. With
this notation, the sites $t \otimes 1$ and $1 \otimes t$ are distinct, the former being the
point $\grave{t}$ on $L_1$, while the latter is the point $\acute{t}$ on $L_2$. But don't be confused
by this notation into thinking that the multiplication on sites somehow fails
to be commutative. No matter how we write things, we still have
$\grave{x}\acute{y} = (x \otimes 1)(1 \otimes y) = (1 \otimes y)(x \otimes 1) = \acute{y}\grave{x}$.

More generally, the tensor-product construction combines $n_1$-forms on an
affine space $A_1$ of dimension $d_1$ with $n_2$-forms on a space $A_2$ of dimension $d_2$ to
produce tensor-product forms of bidegree $(n_1;n_2)$ on the product space
$A_1 \times A_2$, that product space having $(d_1;d_2)$ as its bidimension. In a similar way,
it combines $n_1$-sites over $A_1$ with $n_2$-sites over $A_2$ to produce $(n_1;n_2)$-sites
over $A_1 \times A_2$. Even more generally, we could consider triple tensor products,
such as tensor-product forms with tridegree $(n_1;n_2;n_3)$ on a product space
$A_1 \times A_2 \times A_3$ of tridimension $(d_1;d_2;d_3)$. But the most important case in
CAGD is tensor-product surfaces, where the parameter space is the product
$L_1 \times L_2$ of two lines.



6.9 Tensor-product prototypes 

The theory of Veronese prototypes extends to the tensor-product case, with 
the help of another concept from algebraic geometry: the Segre embedding. 

Let's first consider the example of biquadratic tensor-product surfaces.
The prototype of such surfaces is the surface $\sigma_{2;2}$ that takes the point $(P_1, P_2)$
in the product space $L_1 \times L_2$ to the site $\sigma_{2;2}(P_1, P_2) := P_1^2P_2^2$, lying in the
affine space $(\mathrm{Sym}_2(\hat{L}_1) \otimes \mathrm{Sym}_2(\hat{L}_2))_1 = \mathbb{R}_{2;2}[C_1, \varphi_1; C_2, \varphi_2]_1$. That affine
space is 8-dimensional, a typical element of it, a unit-weight $(2;2)$-site $s$ on
$L_1 \times L_2$, being uniquely expressible in the form

$$
\begin{aligned}
s = {} & C_1^2C_2^2 + s_{01}\,C_1^2C_2\varphi_2 + s_{02}\,C_1^2\varphi_2^2 \\
& + s_{10}\,C_1\varphi_1C_2^2 + s_{11}\,C_1\varphi_1C_2\varphi_2 + s_{12}\,C_1\varphi_1\varphi_2^2 \\
& + s_{20}\,\varphi_1^2C_2^2 + s_{21}\,\varphi_1^2C_2\varphi_2 + s_{22}\,\varphi_1^2\varphi_2^2.
\end{aligned}
\tag{6.9-1}
$$

The sites $s$ in that 8-space that lie on the prototypical surface $\sigma_{2;2}$ are those
that factor as the product of two perfect squares:
$s = \sigma_{2;2}(u_1, u_2) = \grave{u}_1^2\acute{u}_2^2 = (C_1 + u_1\varphi_1)^2(C_2 + u_2\varphi_2)^2$.
When we use a biquadratic surface patch in one of
our designs, say parameterized over the rectangle $[a\,..\,b] \times [c\,..\,d]$, we can view
that patch as an affine transform of the prototypical patch $\sigma_{2;2}([a\,..\,b] \times [c\,..\,d])$.
And one convenient way to specify the instancing transformation that we
intend is to specify the images of the nine sites
$(\grave{a}^{2-i}\grave{b}^{i}\acute{c}^{2-j}\acute{d}^{j})_{0 \le i,j \le 2}$, those
nine images being the Bezier points of the resulting biquadratic patch.
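
Specifying the nine Bezier points determines the patch, because each square $\grave{u}^2$ expands in the frame $\grave{a}^2$, $\grave{a}\grave{b}$, $\grave{b}^2$ with quadratic Bernstein coefficients. The following is a minimal sketch of the resulting evaluation rule; the function names and the sample control net are hypothetical.

```python
from math import comb

def bernstein(n, i, t):
    """Bernstein basis polynomial of degree n and index i at parameter t."""
    return comb(n, i) * (1 - t) ** (n - i) * t ** i

def eval_biquadratic(ctrl, u, v, a=0, b=1, c=0, d=1):
    """Point of the patch at (u, v) over the rectangle [a..b] x [c..d].

    ctrl[i][j] is the Bezier point attached to the site whose L1 factors
    are a^(2-i) b^i and whose L2 factors are c^(2-j) d^j.
    """
    s = (u - a) / (b - a)          # local coordinate along L1
    t = (v - c) / (d - c)          # local coordinate along L2
    dim = len(ctrl[0][0])
    return tuple(
        sum(bernstein(2, i, s) * bernstein(2, j, t) * ctrl[i][j][k]
            for i in range(3) for j in range(3))
        for k in range(dim))

# Hypothetical control net: Bezier point (i, j, i*j) for each index pair.
ctrl = [[(i, j, i * j) for j in range(3)] for i in range(3)]
print(eval_biquadratic(ctrl, 0.5, 0.5))
```

The corner Bezier points are interpolated, as expected for the images of the sites $\grave{a}^2\acute{c}^2$ and $\grave{b}^2\acute{d}^2$.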

Viewed more abstractly, the prototype $\sigma_{n_1;n_2}$ for tensor-product surfaces
of bidegree $(n_1;n_2)$ can be thought of as $\sigma_{n_1;n_2} = \theta_{(1;1),(n_1;n_2)}$, a Veronese
prototype for forms of bidegree $(n_1;n_2)$ on a product space of bidimension $(1;1)$.
The Segre embedding is a construction in algebraic geometry that is relevant
here, since it lets us combine two Veronese prototypes, say $\theta_{d_1,n_1}$ and $\theta_{d_2,n_2}$,
into a tensor-product prototype $\theta_{(d_1;d_2),(n_1;n_2)}$. The Segre embedding [30, 34]
maps the product of $(k_1 - 1)$-space and $(k_2 - 1)$-space into $(k_1k_2 - 1)$-space.
For example, if $A$ is an affine plane and $B$ is an affine 3-space, so $k_1 = 3$ and
$k_2 = 4$, the Segre embedding maps $A \times B$ into an affine space of dimension
$3 \cdot 4 - 1 = 11$ by the rule

$$
\bigl((1, u, v),\, (1, x, y, z)\bigr) \;\mapsto\;
\begin{pmatrix}
1 & x & y & z \\
u & ux & uy & uz \\
v & vx & vy & vz
\end{pmatrix},
$$

where we have written all three weight coordinates as explicit 1's to make
the pattern clearer. Note that a point in the target 11-space, that is, a 3-by-4
matrix of real numbers with a 1 in the upper-left corner, lies in the image of
this Segre embedding just when that matrix has rank 1.
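
The rank-1 criterion is straightforward to test on examples. Here is a small sketch with hypothetical helper names, in which a matrix counts as rank 1 when it is nonzero and all of its 2-by-2 minors vanish.

```python
def segre(p, q):
    """Outer product of two weight-first coordinate tuples, as a matrix."""
    return [[x * y for y in q] for x in p]

def has_rank_one(m):
    """True when m is nonzero and every 2x2 minor vanishes."""
    rows, cols = len(m), len(m[0])
    nonzero = any(m[i][j] != 0 for i in range(rows) for j in range(cols))
    minors_vanish = all(
        m[i][j] * m[k][l] == m[i][l] * m[k][j]
        for i in range(rows) for k in range(i + 1, rows)
        for j in range(cols) for l in range(j + 1, cols))
    return nonzero and minors_vanish

# The point (2, 3) of the plane paired with the point (5, 7, 11) of 3-space.
image = segre((1, 2, 3), (1, 5, 7, 11))

# Changing a single entry moves the matrix off the image of the embedding.
perturbed = [row[:] for row in image]
perturbed[2][3] += 1

print(has_rank_one(image), has_rank_one(perturbed))
```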

A tensor-product Veronese prototype is the image, under the appropriate
Segre embedding, of the Cartesian product of two separate Veronese
prototypes. For example, suppose that we want to construct $\theta_{(d_1;d_2),(n_1;n_2)}$, the
prototype for forms of bidegree $(n_1;n_2)$ on a product space of bidimension
$(d_1;d_2)$. We begin with the separate Veronese prototypes $\theta_{d_1,n_1}$ and $\theta_{d_2,n_2}$,
which sit in affine spaces of dimensions $\binom{n_1+d_1}{d_1} - 1$ and $\binom{n_2+d_2}{d_2} - 1$. Setting
$k_i := \binom{n_i+d_i}{d_i}$ for $i$ in $\{1, 2\}$, we then use the Segre embedding with
parameters $(k_1, k_2)$ to embed the Cartesian product of those separate Veronese
prototypes into an affine space of dimension $k_1k_2 - 1 = \binom{n_1+d_1}{d_1}\binom{n_2+d_2}{d_2} - 1$.



Chapter 7 



The Paired-Algebras Framework

We have built the algebras of forms and sites as separate algebras, as shown
in Figure 5.1. But the two algebras realize their full power only when we pair
up the space $\mathrm{Sym}_n(\hat A^*)$ of n-forms on A with the space
$\mathrm{Sym}_n(\hat A)$ of n-sites over A, for each n, so that each can
represent the dual of the other. For example, let P be a point in A and
consider the evaluate-at-P functional on n-forms, that is, the linear
functional $e_P$ that takes an n-form f as its argument and returns the real
number $e_P(f) := f(P)$. Once we choose a pairing between n-forms and
n-sites, we can represent that linear functional $e_P$ as a certain n-site.
It turns out that there are two reasonable choices: Either $e_P = P^n$ or
$e_P = P^n/n!$.

Warning: We are about to take the only mathematical step in the entire 
construction of the paired algebras where there is a real choice about what 
to do. There are two candidate pairings between the spaces
$\mathrm{Sym}_n(\hat A^*)$ and $\mathrm{Sym}_n(\hat A)$, the summed pairing
and the averaged pairing, and they differ by a factor of $n!$. Adopting the
summed pairing leads to an annoying factor of $n!$ in any formula that
evaluates an n-ic: for example, the denominator in the formula
$e_P = P^n/n!$. But it leads to simple, powerful, and familiar
formulas for differentiation. Adopting the averaged pairing would simplify 
evaluation a bit, at the price of complicating differentiation a lot. I argue 
in Appendix B that the summed pairing is the wiser overall choice, and 
this monograph follows my advice. I hope that other researchers in CAGD 
will find my arguments convincing, lest we all find ourselves bedeviled by 
conflicting conventions. Sad to say, there is no way to make the formulas for 
evaluation and for differentiation both come out pretty. Indeed, evaluating 
an n-form is essentially the same process as differentiating that form n times 
in the same direction, except that the latter result exceeds the former by
that annoying factor of $n!$.

This controversy about how to scale the pairing maps is unfortunate,
but don't be too downcast: The controversy concerns only a numeric factor.
The deep benefit of the pairing maps is that they allow us to exploit the
multiplication in the algebra of sites as a new tool with which to study linear
functionals on forms, and vice versa. For example, the formula
$e_P = P^n/n!$ tells us that the evaluate-at-P functional $e_P$ is a perfect
$n^{\text{th}}$ power. So, in the linear space $\mathrm{Sym}_n(\hat A)$ of
n-sites, what is the geometry of those n-sites that represent point
evaluations? Answer: Except for the annoying factor of $n!$, that set is
precisely the Veronese d-fold of parametric degree n, the set
$\theta_{d,n}(A)$ of all perfect $n^{\text{th}}$ powers of points.

[Figure 7.1: The paired-algebras framework. Column labels: Sites, Forms.]

7.1 Picturing our goal 

Figure 7.1 shows our final goal at last, the paired-algebras framework. The
double-headed arrow on the $n^{\text{th}}$ level denotes the pairing that we
shall adopt between the spaces $\mathrm{Sym}_n(\hat A)$ and
$\mathrm{Sym}_n(\hat A^*)$, thereby allowing us to use each to represent the
dual of the other.

The issue about which pairings to adopt, the summed or the averaged, 
leaves the lower levels in Figure 7.1 somewhat fuzzy. But note that there is 
no issue about levels 0 and 1. On level 1, we want the fundamental pairing
between the spaces $\hat A$ and $\hat A^*$ of anchors and coanchors. On
level 0, we want the pairing that combines two real numbers by multiplying
them. The annoying factor of $n!$ becomes an issue only once n exceeds 1.

Math remark: In the paired-algebras framework, the linear space
$\mathrm{Sym}_n(\hat A)^*$ of dual functionals on n-sites is represented by
the space $\mathrm{Sym}_n(\hat A^*)$ of n-forms, and the same is true with
forms and sites reversed. Is a similar representation possible for the
algebras in their entirety? The whole algebra of sites $\mathrm{Sym}(\hat A)$
is also a linear space, albeit of infinite dimension, so it has a dual
$\mathrm{Sym}(\hat A)^*$. Can we use the whole algebra of forms
$\mathrm{Sym}(\hat A^*)$ to represent that dual, all at once? That is, can we
combine all of the separate double-headed arrows in Figure 7.1 into one fat
double-headed arrow?

No, because of the blow-up in dimension that happens when we take
the dual of an infinite-dimensional space. The dual space
$\mathrm{Sym}(\hat A)^*$ is huge. We can think of an element F in that dual
space as a sum $F = \sum_{n \ge 0} f_n$, where each $f_n$ is an n-form, but
with no requirement that all but finitely many of the $(f_n)$ must be zero.
Instead, all of the $(f_n)$ may be nonzero simultaneously. The infinite sum F
still determines a linear map from sites to real numbers as follows. Any site
s can be uniquely expanded as a sum $s = \sum_{n \ge 0} s_n$ of its graded
components $(s_n)$, where $s_n$ is an n-site and where all but finitely many
of the $(s_n)$ are zero. So we can define F(s) by the rule
$F(s) := \sum_{n \ge 0} \langle f_n, s_n \rangle$, and the resulting sum of
real numbers will always be a finite sum. But the algebra of all such infinite
sums $F = \sum_{n \ge 0} f_n$ is vastly larger than the algebra of forms, in
which only finitely many of the graded components $(f_n)$ are allowed to be
nonzero.

Bourbaki [4] introduces the concept of the graded dual of a graded algebra, 
which is the direct sum of the duals of the graded slices. By exploiting that 
concept, we could combine all of the double-headed arrows in Figure 7.1 into 
one fat arrow: a fat arrow asserting that each of the graded algebras Sym(A) 
and Sym(A*) can represent the graded dual of the other. 



7.2 Lineals and perfect powers 

To prepare for defining the pairing maps, we go over some easy lemmas about 
the algebras of forms and sites. 

Lemma 7.2-1 Given any affine space A of finite dimension d := dim(A)
and any nonnegative integer n, every n-form on A is a linear combination of
real-lineal n-forms. The same goes for sites over A.

Proof Fix some basis $(c_0, \ldots, c_d)$ for the space $\hat A^*$ of
coanchors on A. As on the second line of Table 5.1, we can then concretely
construct the linear space $\mathrm{Sym}_n(\hat A^*)$ of n-forms on A as the
space $\mathbb{R}_n[c_0, \ldots, c_d]$ of all polynomials that are
homogeneous of degree n in the variables $(c_0, \ldots, c_d)$.

Let $\alpha$ denote a multi-index $\alpha = (\alpha_0, \ldots, \alpha_d)$,
where each $\alpha_i$ is nonnegative and where
$|\alpha| := \alpha_0 + \cdots + \alpha_d$ satisfies $|\alpha| = n$. We then
denote by $c^\alpha$ the n-form
$c^\alpha := c_0^{\alpha_0} \cdots c_d^{\alpha_d}$. The n-forms
$(c^\alpha)_{|\alpha|=n}$ form a basis for the space
$\mathbb{R}_n[c_0, \ldots, c_d]$ of n-forms; let's call it the monomial basis.

Every n-form on A is a linear combination of the $\binom{n+d}{n}$ monomials
in this basis. And each of those monomials is clearly real-lineal; indeed, it
splits as the product of n elements of our chosen basis $(c_0, \ldots, c_d)$
for $\hat A^*$. So every n-form is a linear combination of real-lineal
n-forms. □



In fact, more is true. 



Lemma 7.2-2 Given any affine space A of finite dimension d := dim(A) 
and any nonnegative n, every n-form on A is a linear combination of n-forms 
that are perfect n th powers. Again, the same goes for sites over A. 

Proof We know from Lemma 7.2-1 that every n-form is a linear combination 
of real-lineal n-forms. So it suffices to prove that every real-lineal n-form is 
a linear combination of perfect n th powers. 

Let $f = h_1 \cdots h_n$ be a real-lineal n-form on A. In the case n = 2, we
have

(7.2-3)  $h_1 h_2 = \dfrac{(h_1 + h_2)^2 - h_1^2 - h_2^2}{2}.$

This is the quadratic case of the Polarization Identity. An inclusion-exclusion
argument establishes the general case

(7.2-4)  $h_1 \cdots h_n = \dfrac{1}{n!} \sum_{S \subseteq \{1,\ldots,n\}} (-1)^{n-|S|} \Big(\sum_{i \in S} h_i\Big)^n,$

which expresses the product $h_1 \cdots h_n$ as a linear combination of
$2^n - 1$ perfect $n^{\text{th}}$ powers. □

It might seem more natural to require the set S to be nonempty in the
sum in Equation 7.2-4, since the term resulting from $S = \emptyset$ is zero
in any case; that's why there are only $2^n - 1$ terms, instead of $2^n$. But
the resulting term is actually $0^n$, which is 0 when n is positive, but is
$0^0 = 1$ when n = 0. And that 1 is needed to make Equation 7.2-4 correct in
the case n = 0.
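
The inclusion-exclusion sum in Equation 7.2-4 can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it substitutes plain integers for the coanchors $h_1, \ldots, h_n$ (legitimate because the identity is a polynomial identity in commuting quantities), and the function name `polarized_product` is hypothetical. It uses exact rational arithmetic so that the division by $n!$ introduces no rounding.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def polarized_product(h):
    """Right-hand side of Equation 7.2-4 for a list h of n commuting values."""
    n = len(h)
    total = Fraction(0)
    for size in range(n + 1):
        # combinations() selects by position, so repeated values in h are fine.
        for subset in combinations(h, size):
            # For the empty subset the base is 0; Python evaluates 0**0 as 1,
            # which matches the convention discussed in the text.
            total += (-1) ** (n - size) * Fraction(sum(subset)) ** n
    return total / factorial(n)
```

For instance, `polarized_product([2, 3, 5])` sums $10^3 - (5^3 + 7^3 + 8^3) + (2^3 + 3^3 + 5^3) - 0^3 = 180$ and divides by $3! = 6$.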

Math remark: Both Lemma 7.2-1 and Lemma 7.2-2 hold equally well in the
algebrization of any linear space X; they don't use any special properties of
the spaces $\hat A^*$ or $\hat A$ of coanchors or anchors. Indeed, those
lemmas hold even when X is infinite dimensional, with the same proofs. But
Lemma 7.2-2 does not hold over fields of finite characteristic, because of
the division by $n!$.

In the case of sites, Lemma 7.2-2 can be strengthened a bit further still. 

Lemma 7.2-5 Given any affine space A of finite dimension d := dim(A) 
and any nonnegative n, every n-site over A is a linear combination of n-sites 
that are n th powers of points — that is, n th powers of anchors of unit weight. 

Vacant remark: We make the convention that the real number 1 is the
$0^{\text{th}}$ power of a point. This convention isn't controversial when
$d \ge 0$, since $P^0 = 1$, for all points P in A. When $d = -1$, however,
there are no points P in A to raise to the $0^{\text{th}}$ power. We argue
that 1 is a $0^{\text{th}}$ power of a point anyway, since we can write 1 as
a product of 0 factors, a product in which all factors are equal and all
factors (of which there aren't any) are points.

Proof When n = 0, an n-site over A is a real number and hence a scalar 
multiple of 1, which is a 0 th power of a point by the convention that we just 
adopted. So we may suppose that n is positive. 

Since $(tp)^n = t^n p^n$ for any real number t and anchor p, it suffices to
show that every n-site is a linear combination of $n^{\text{th}}$ powers of
anchors whose weights are not zero. To see this, choose a barycentric
reference frame $(R_0, \ldots, R_d)$ for A, consisting entirely of points.
Every n-site is then a linear combination of the monomials
$(R^\alpha)_{|\alpha|=n}$, which form a basis for the space
$\mathrm{Sym}_n(\hat A)$ of n-sites. When we apply Equation 7.2-4 to such a
monomial, all of the $n^{\text{th}}$ powers that arise will have the form
$(\beta_0 R_0 + \cdots + \beta_d R_d)^n$, for some multi-index $\beta$ with
$|\beta| \le n$. If $|\beta| = 0$, then we are talking about $0^n$, which
is zero since n is positive. So we end up with $n^{\text{th}}$ powers of
anchors whose weights are positive integers. □

Let's use $\beta \cdot R$ to denote the sum
$\beta \cdot R := \beta_0 R_0 + \cdots + \beta_d R_d$. We saw, in proving
Lemma 7.2-5, that the perfect $n^{\text{th}}$ powers
$((\beta \cdot R)^n)_{|\beta| \le n}$ span the entire space
$\mathrm{Sym}_n(\hat A)$ of n-sites. We shall show, in Proposition 8.4-1,
that the subset $((\beta \cdot R)^n)_{|\beta| = n}$ actually forms a basis.

Exercise 7.2-6 Equation 7.2-3 expresses the product $h_1 h_2$ as a linear
combination of three perfect squares; but the similar identity

$$h_1 h_2 = \frac{(h_1 + h_2)^2 - (h_1 - h_2)^2}{4}$$

uses only two squares, which is clearly the fewest possible. By generalizing
this latter identity, write the product $h_1 \cdots h_n$ as a linear
combination of only $2^{n-1}$ perfect $n^{\text{th}}$ powers.
Answer: We have

$$h_1 \cdots h_n = \frac{1}{2^{n-1}\, n!}
\sum_{\substack{S \cup T = \{2,\ldots,n\} \\ S \cap T = \emptyset}}
(-1)^{|T|} \Big(h_1 + \sum_{i \in S} h_i - \sum_{j \in T} h_j\Big)^n.$$

Could we get by with even fewer than $2^{n-1}$ perfect $n^{\text{th}}$
powers? In the case n = 3, it turns out that four cubes are necessary; a
Gröbner-basis calculation
with Maple [9] or Singular [27] establishes that it is impossible to write the
product xyz as the sum of three terms, each of which is the cube of a linear 
combination of x, y, and z. But I don't know whether eight fourth powers 
are necessary, when n = 4; perhaps fewer would suffice? 
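
The $2^{n-1}$-term identity in the answer above can be spot-checked the same way as Equation 7.2-4. This is a minimal sketch under the same assumption that integers stand in for the coanchors; the function name `signed_polarized_product` is hypothetical. Each choice of signs for $h_2, \ldots, h_n$ corresponds to one partition of $\{2, \ldots, n\}$ into S (sign +1) and T (sign -1).

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def signed_polarized_product(h):
    """h[0] * ... * h[n-1] written with 2**(n-1) perfect n-th powers (needs n >= 1)."""
    n = len(h)
    total = Fraction(0)
    # The sign of h[0] is pinned to +1; every other factor gets +1 (in S) or -1 (in T).
    for signs in product((1, -1), repeat=n - 1):
        linear = h[0] + sum(e * x for e, x in zip(signs, h[1:]))
        # prod(signs) equals (-1) ** |T|, the sign attached to this term.
        total += prod(signs) * Fraction(linear) ** n
    return total / (2 ** (n - 1) * factorial(n))
```

For n = 2 this reduces to the two-square identity displayed in the exercise.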

7.3 The Permanent Identity 

With those lemmas under our belt, it is time to define the pairing, for each n,
between n-forms on A and n-sites over A. That is, we want to pair the linear
spaces $\mathrm{Sym}_n(\hat A^*)$ and $\mathrm{Sym}_n(\hat A)$. Those two
linear spaces have the same finite dimension; so there are lots of pairings
between them. The key to a useful theory is to find a pairing that interacts
well with the multiplications in the algebras of forms and sites.

By Lemma 7.2-1, defining such a pairing on n-forms and n-sites that
are real-lineal would suffice to define it everywhere. So let's think about
an n-form $f = h_1 \cdots h_n$ that is the product of n coanchors and an
n-site $s = p_1 \cdots p_n$ that is the product of n anchors. What value
should we assign to
$\langle f, s \rangle = \langle h_1 \cdots h_n, p_1 \cdots p_n \rangle$?
One value that might be relevant is the product
$\langle h_1, p_1 \rangle \langle h_2, p_2 \rangle \cdots \langle h_n, p_n \rangle$.
But the order in which we numbered the n factors $h_1$ through $h_n$ of f was
arbitrary, and the same for s; so there is no reason why matching $h_i$ with
$p_i$, for i from 1 to n, makes more sense than any other way of matching up
the n coanchors with the n anchors. To be symmetric, we should consider all
possible such matchings and then (and here is where the $n!$ issue arises)
we either sum or average the results.

Let's say that a pairing map between n-forms and n-sites satisfies the
Summed Permanent Identity when, for all coanchors $h_1$ through $h_n$ and for
all anchors $p_1$ through $p_n$, we have

(7.3-1)  $\langle h_1 \cdots h_n, p_1 \cdots p_n \rangle = \sum_{\nu \in S_n} \prod_{k \in [1..n]} \langle h_k, p_{\nu(k)} \rangle,$

where the summation index $\nu$ varies over the symmetric group $S_n$ of all
$n!$ permutations of the integers from 1 to n. The Averaged Permanent Identity
is the same, except that we divide the sum by $n!$:

(7.3-2)  $\langle h_1 \cdots h_n, p_1 \cdots p_n \rangle = \dfrac{1}{n!} \sum_{\nu \in S_n} \prod_{k \in [1..n]} \langle h_k, p_{\nu(k)} \rangle.$

The term "permanent" is appropriate because that sum is the permanent of
the n-by-n matrix whose $(i,j)^{\text{th}}$ entry is
$\langle h_i, p_j \rangle$. Recall that the permanent of a matrix is like the
determinant, except that all products are added; in the determinant, of
course, products from even permutations $\nu$ are added, but those from odd
permutations are subtracted.* In the next section, we show that there exists
a unique pairing between n-forms and n-sites that satisfies the Summed
Permanent Identity; dividing that pairing by $n!$ gives the unique pairing
that satisfies the Averaged Permanent Identity.
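
To make the two identities concrete, here is a minimal Python sketch of their right-hand sides. It rests on assumptions that are not in the text: coanchors and anchors are given as integer coordinate tuples with respect to a pair of dual bases, so that the fundamental pairing is a dot product, and all function names are hypothetical. The permanent is computed by brute force over all $n!$ permutations, which is fine for small n.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod

def fundamental_pairing(h, p):
    """<h, p> for a coanchor h and an anchor p, as coordinate tuples in dual bases."""
    return sum(hi * pi for hi, pi in zip(h, p))

def summed_pairing(hs, ps):
    """Right-hand side of 7.3-1: the permanent of the matrix (<h_i, p_j>)."""
    n = len(hs)
    assert len(ps) == n
    return sum(
        prod(fundamental_pairing(hs[k], ps[nu[k]]) for k in range(n))
        for nu in permutations(range(n))
    )

def averaged_pairing(hs, ps):
    """Right-hand side of 7.3-2: the same permanent divided by n!."""
    return Fraction(summed_pairing(hs, ps)) / factorial(len(hs))
```

With n = 0 there is exactly one (empty) permutation and its product is empty, so both functions return 1.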

When we wrote the Permanent Identities, we used angle brackets both 
for the pairing on the left-hand side, the new pairing between n-forms and 
n-sites, and also for the pairing on the right-hand side, the fundamental 
pairing between coanchors and anchors. In the particular case n = 1, that
could potentially lead to confusion. Fortunately, both of the Permanent
Identities reduce, in the case n = 1, to the identity
$\langle h_1, p_1 \rangle = \langle h_1, p_1 \rangle$. Since
the new pairing is thus required to agree with the old wherever the old is 
defined, it causes no confusion to use the same angle brackets for both. 

Exercise 7.3-3 What values do the two Permanent Identities mandate for
the pairing value $\langle f, s \rangle$ in the case n = 0, when the 0-form f
and the 0-site s are simply real numbers?

Answer: The sum has a single term, and that single term is an empty
product; so both of the Permanent Identities mandate that
$\langle 1, 1 \rangle = 1$. By bilinearity, it follows that
$\langle f, s \rangle = fs$, for all real numbers f and s.

* Math remark: The permanent arises because we are studying the symmetric
algebra Sym(X). The analogous formula for the alternating algebra Alt(X) has
the determinant instead. That explains why the alternating algebra is
appropriate for multivariate calculus, where the determinant of a Jacobian
measures the ratio of two signed volumes.

We started this section by using Lemma 7.2-1 to restrict our attention to 
n-forms and n-sites that are real-lineal. By using the stronger Lemma 7.2-2, 
we could have gone further and restricted our attention to n-forms and n-sites 
that are perfect $n^{\text{th}}$ powers. What do the Permanent Identities say
about that special case? When $h_1 = \cdots = h_n = h$ and
$p_1 = \cdots = p_n = p$, the Summed Permanent Identity mandates that
$\langle h^n, p^n \rangle = n!\,\langle h, p \rangle^n$, the $n!$ arising
from the $n!$ different ways of matching up the n identical h's with the n
identical p's. The Averaged Permanent Identity divides out that $n!$,
giving the simpler formula $\langle h^n, p^n \rangle = \langle h, p \rangle^n$.



7.4 Defining the pairing 

Recall, from Section 2.3, that every bilinear form
$B\colon X \times Y \to \mathbb{R}$ has an associated matrix M, under the
convention that the scalar B(x, y) is given by the matrix product $x^t M y$.
Furthermore, the bilinear form B is a pairing just when its matrix M is
invertible.

Proposition 7.4-1 Let A be an affine space of finite dimension d := dim(A),
and let n be nonnegative. There is a unique pairing between the space
$\mathrm{Sym}_n(\hat A^*)$ of n-forms on A and the space
$\mathrm{Sym}_n(\hat A)$ of n-sites over A that satisfies the Summed
Permanent Identity. We christen it the summed pairing. Dividing the summed
pairing by $n!$ gives the averaged pairing, the unique pairing that satisfies
the Averaged Permanent Identity.

Proof We construct the spaces $\mathrm{Sym}_n(\hat A^*)$ and
$\mathrm{Sym}_n(\hat A)$ concretely by fixing bases. Let $(a_0, \ldots, a_d)$
be some basis for the linearized space $\hat A$ of anchors over A, and let
$(c_0, \ldots, c_d)$ be the basis for the space $\hat A^*$ of coanchors on A
that is dual to $(a_0, \ldots, a_d)$. The duality constraints tell us that

$$\langle c_i, a_j \rangle =
\begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise.} \end{cases}$$

We can then represent the space $\mathrm{Sym}_n(\hat A^*)$ of n-forms as the
space of polynomials $\mathbb{R}_n[c_0, \ldots, c_d]$, while the space
$\mathrm{Sym}_n(\hat A)$ of n-sites is $\mathbb{R}_n[a_0, \ldots, a_d]$.
Note that both spaces have dimension $\binom{n+d}{n}$.

Let $\gamma$ denote a multi-index $\gamma = (\gamma_0, \ldots, \gamma_d)$
with $|\gamma| := \gamma_0 + \cdots + \gamma_d = n$. The n-forms
$(c^\gamma)_{|\gamma|=n}$ form the monomial basis for the space
$\mathbb{R}_n[c_0, \ldots, c_d]$ of n-forms. In a similar way, the n-sites
$(a^\alpha)_{|\alpha|=n}$ form the monomial basis for the space
$\mathbb{R}_n[a_0, \ldots, a_d]$ of n-sites. In order to define any bilinear
map
$B\colon \mathbb{R}_n[c_0, \ldots, c_d] \times \mathbb{R}_n[a_0, \ldots, a_d] \to \mathbb{R}$,
it suffices to specify the values of B on those n-forms and n-sites that are
monomials, that is, to define the $\binom{n+d}{n}$-by-$\binom{n+d}{n}$ matrix
of real numbers $(B(c^\gamma, a^\alpha))_{|\gamma|=|\alpha|=n}$. For the map
B to be a pairing, that matrix must have full rank.

Each n-form $c^\gamma$ and each n-site $a^\alpha$ is real-lineal, so the
Summed Permanent Identity 7.3-1 leaves us no choice about how to fill in the
matrix of B. What value does it mandate for
$\langle c^\gamma, a^\alpha \rangle$? If the multi-indices $\gamma$ and
$\alpha$ are distinct, then all $n!$ ways of matching the n coanchor factors
with the n anchor factors will involve at least one match-up
$\langle c_k, a_l \rangle$ with $k \ne l$. So all $n!$ terms in the resulting
sum will be zero; for $\gamma \ne \alpha$, we have
$\langle c^\gamma, a^\alpha \rangle = 0$. What about when $\gamma = \alpha$?
To get a nonzero term in the sum, we must match up, for each k from 0 to d,
the $\gamma_k$ copies of $c_k$ in $c^\gamma$ with the $\gamma_k = \alpha_k$
copies of $a_k$ in $a^\gamma$, and we can do that in $\gamma_k!$ ways. So the
number of nonzero terms is the product
$\gamma_0!\,\gamma_1! \cdots \gamma_d!$, which we shall abbreviate as
$\gamma!$. Since each nonzero term contributes 1, we conclude that the Summed
Permanent Identity mandates:

$$\langle c^\gamma, a^\alpha \rangle =
\begin{cases} \gamma! & \text{if } \gamma = \alpha \\ 0 & \text{otherwise.} \end{cases}$$

The matrix that we have just constructed is diagonal, with all of its
diagonal entries nonzero; so it has full rank. We conclude that there exists
a unique pairing that satisfies all monomial instances of the Summed
Permanent Identity, that is, all instances in which each coanchor factor
$h_k$ lies in our chosen basis $(c_0, \ldots, c_d)$ for $\hat A^*$ and each
anchor factor $p_k$ lies in our chosen basis $(a_0, \ldots, a_d)$ for
$\hat A$.

It remains to verify that this unique pairing in fact satisfies all instances
of the Summed Permanent Identity. To see that, note that both sides of
the Summed Permanent Identity are (2n)-linear functions from
$(\hat A^*)^n \times (\hat A)^n$ to the reals; that is, both sides vary
linearly as a function of each $h_k$ if the other h's and all of the p's are
held fixed, and the same for each $p_k$. Thus, the validity of the Summed
Permanent Identity extends by linearity from monomial instances to all
instances. □

The matrix that we construct in this proof has zeros off the diagonal and
has the positive integer $\gamma!$ in the $(\gamma, \gamma)$ slot on the
diagonal. Once $n \ge 2$, the diagonal entries are not all ones, and that
engenders a warning. We chose the bases $(c_0, \ldots, c_d)$ and
$(a_0, \ldots, a_d)$ for the spaces $\hat A^*$ of coanchors and $\hat A$
of anchors to be dual to each other. But the monomial bases
$(c^\gamma)_{|\gamma|=n}$ and $(a^\alpha)_{|\alpha|=n}$ that we then
constructed for the spaces $\mathbb{R}_n[c_0, \ldots, c_d]$ of n-forms and
$\mathbb{R}_n[a_0, \ldots, a_d]$ of n-sites are not dual to each other; that
is, they are not dual under the unique pairing that satisfies the Summed
Permanent Identity.
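
The diagonal entries $\gamma!$ can be recomputed mechanically from the Summed Permanent Identity. The sketch below is an illustration only: multi-indices are Python tuples, the helper names are hypothetical, and the permanent is again evaluated by brute force, using $\langle c_k, a_l \rangle = 1$ when k = l and 0 otherwise.

```python
from itertools import permutations
from math import factorial, prod

def factors(multi_index):
    """Expand a multi-index such as (2, 0, 1) into the factor list [0, 0, 2]."""
    return [k for k, count in enumerate(multi_index) for _ in range(count)]

def monomial_pairing(gamma, alpha):
    """<c^gamma, a^alpha> computed from 7.3-1 for dual bases (c_k) and (a_l)."""
    cs, as_ = factors(gamma), factors(alpha)
    assert len(cs) == len(as_)
    return sum(
        prod(1 if cs[k] == as_[nu[k]] else 0 for k in range(len(cs)))
        for nu in permutations(range(len(cs)))
    )

def multi_factorial(gamma):
    """gamma! = gamma_0! * gamma_1! * ... * gamma_d!."""
    return prod(factorial(g) for g in gamma)
```

For example, with $\gamma = (2, 0, 1)$ the factor list is $c_0 c_0 c_2$, and exactly $2!\,0!\,1! = 2$ of the six matchings pair every factor with its dual.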
Math remark: The existence of diagonal entries $\gamma!$ that exceed 1 makes
the theory of symmetric algebras somewhat subtle. Among other things, over a
field of prime characteristic, the analog of the summed pairing may fail to be
a pairing; once n is at least the characteristic, the matrix that we get from
the Summed Permanent Identity has zeros on the diagonal, and hence fails
to have full rank. Alternating algebras are better behaved in this respect.
In the alternating algebras $\mathrm{Alt}(\hat A)$ and
$\mathrm{Alt}(\hat A^*)$, the analogs of the monomial basis elements are
$a_{i_1} \wedge \cdots \wedge a_{i_n}$ and
$c_{i_1} \wedge \cdots \wedge c_{i_n}$ for $i_1 < \cdots < i_n$, with no
repeated factors allowed. So the Summed Determinant Identity produces a
matrix in which all of the diagonal entries are ones.

Averaging, rather than summing, divides everything by $n!$, so the
$(\gamma, \gamma)$ slot on the diagonal is $\gamma!/n!$. Recall that
$\gamma!$ divides $n!$, for all $\gamma$ with $|\gamma| = n$, since the
quotient $\binom{n}{\gamma} := n!/\gamma!$ is a multinomial coefficient, and
hence an integer. So the diagonal entries
$\gamma!/n! = 1/\binom{n}{\gamma}$ are reciprocals of integers.

It's bad news that the diagonal entries of these matrices are not all equal. 
But it's good news that all of the off-diagonal entries are zero. It follows that 
the dual of a monomial basis differs only by some factorial scale factors from 
being itself a monomial basis. 

Proposition 7.4-2 Given a d-dimensional affine space A, let
$(a_0, \ldots, a_d)$ and $(c_0, \ldots, c_d)$ be dual bases for the spaces
$\hat A$ and $\hat A^*$ of anchors and coanchors, and consider the
corresponding monomial bases $(a^\alpha)_{|\alpha|=n}$ and
$(c^\gamma)_{|\gamma|=n}$ for the spaces $\mathrm{Sym}_n(\hat A)$ and
$\mathrm{Sym}_n(\hat A^*)$ of n-sites and n-forms. Under the summed pairing,
the dual of the basis $(a^\alpha)_{|\alpha|=n}$ is the scaled monomial basis
$(c^\gamma/\gamma!)_{|\gamma|=n}$. Alternatively, putting the scale factors
on the other side, the dual of the basis $(c^\gamma)_{|\gamma|=n}$ is the
basis $(a^\alpha/\alpha!)_{|\alpha|=n}$. Under the averaged pairing, the
scale factors are larger by a factor of $n!$; so the dual of $(a^\alpha)$ is
$\big(\binom{n}{\gamma} c^\gamma\big)$ and the dual of $(c^\gamma)$ is
$\big(\binom{n}{\alpha} a^\alpha\big)$.

Proof In proving Proposition 7.4-1, we saw that the value
$\langle c^\gamma, a^\alpha \rangle$ given by the summed pairing is

$$\langle c^\gamma, a^\alpha \rangle =
\begin{cases} \gamma! & \text{if } \gamma = \alpha \\ 0 & \text{otherwise.} \end{cases}$$

Thus, to end up with dual bases, it suffices to divide one basis or the other
by the factorial of its multi-index. Under the averaged pairing, we must also
multiply one basis or the other by $n!$. □

7.5 Summing is better — trust me 

We have reached an unpleasant juncture. We want to pair n-forms with
n-sites, and we have found two candidate pairings, differing by a factor of
$n!$. Perfectly adequate theories can be erected based on either of those
pairings, and reasonable people could prefer either theory.

So far, we have been laying the groundwork for the two theories in parallel. 
If we continue to develop the two in parallel, however, we shall be faced with 
two versions of every new formula, which seems a recipe for confusion. Better, 
instead, to develop one theory in isolation. We can return later to discuss 
how the other theory would differ from the one that we then understand. 

Until further notice, therefore, we are going to pair n-forms with n-sites 
using the summed pairing, the unique pairing map that satisfies the Summed 
Permanent Identity 7.3-1. We won't reopen the summing-versus-averaging
debate until Appendix B, where we analyze how all of our formulas would 
change if we averaged, instead of summing. Some things would get prettier, 
others would get uglier. But Appendix B argues that, in the context of 
CAGD, summing beats averaging overall. 

Unfortunately, the costs of summing show up before its benefits; that 
is, summing clutters up some formulas that you learn right away, thereby 
enabling some formulas that you don't learn until later to be cleaner. So 
we are going to run across annoying factors of $n!$ quite soon. Please grant
summing the benefit of the doubt until Appendix B. 

7.6 Evaluating an n-form 

Having chosen the summed pairing, we now have a slew of formulas to cover, 
formulas that relate pairing to other operations on forms and sites. The first 
of those operations is evaluation, and the basic rule for evaluation is this: 

To evaluate an n-ic under the summed pairing, pair it with an
$n^{\text{th}}$ power and divide by $n!$.

Proposition 7.6-1 If f is any n-form on an affine space A and if P is any
point in A, evaluation of f at P is related to pairing by the formula

(7.6-2)  $f(P) = \langle f, P^n/n! \rangle.$

More generally, the same formula holds with the point P replaced by any
anchor p over A:

(7.6-3)  $f(p) = \langle f, p^n/n! \rangle.$

Dually, if s is any n-site over A and h is any coanchor on A, we have

(7.6-4)  $s(h) = \langle h^n/n!, s \rangle.$
Proof Lemma 7.2-2 tells us that every n-form on A is a linear combination
of perfect $n^{\text{th}}$ powers. Since evaluation at a fixed point P is a
linear process, it suffices to prove Formula 7.6-2 when $f = h^n$ is the
$n^{\text{th}}$ power of some coanchor h. In that case, the left-hand side is
$h^n(P) = (h(P))^n = \langle h, P \rangle^n$. On the right-hand side, the
Summed Permanent Identity 7.3-1 tells us that
$\langle h^n, P^n/n! \rangle = \langle h^n, P^n \rangle/n! = (n!\,\langle h, P \rangle^n)/n! = \langle h, P \rangle^n$.
The same proof applies to any anchor p over A, and the proof of the dual
result is symmetric. □
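
Formula 7.6-3 can be checked on a small example. The following Python sketch makes several assumptions that are not in the text: an n-form is stored as a dictionary from multi-indices $\gamma$ to the coefficients of $c^\gamma$, an anchor is a tuple of integer coordinates in the basis $(a_0, \ldots, a_d)$, and all names are hypothetical. It expands $p^n$ by the multinomial theorem and pairs monomials using the diagonal matrix with entries $\gamma!$ from Proposition 7.4-1.

```python
from fractions import Fraction
from math import factorial, prod

def multi_factorial(gamma):
    return prod(factorial(g) for g in gamma)

def compositions(n, parts):
    """Yield every multi-index of length `parts` whose entries sum to n."""
    if parts == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, parts - 1):
            yield (first,) + rest

def power_site(p, n):
    """Coefficients of the n-site p**n in the monomial basis (a^alpha)."""
    return {
        alpha: Fraction(factorial(n), multi_factorial(alpha))
        * prod(x ** e for x, e in zip(p, alpha))
        for alpha in compositions(n, len(p))
    }

def summed_pairing(form, site):
    """<f, s> for dicts keyed by multi-index; the pairing matrix is diag(gamma!)."""
    return sum(c * multi_factorial(g) * site.get(g, 0) for g, c in form.items())

def evaluate(form, p):
    """Direct evaluation, using the fact that c_k(p) is the k-th coordinate of p."""
    return sum(c * prod(x ** e for x, e in zip(p, g)) for g, c in form.items())

# Hypothetical example: the quadratic form c_0^2 - 5 c_1 c_2 and the anchor (1, 2, 3).
f = {(2, 0, 0): 1, (0, 1, 1): -5}
P = (1, 2, 3)
by_pairing = summed_pairing(f, power_site(P, 2)) / factorial(2)
```

Here the raw pairing $\langle f, P^2 \rangle$ is twice the value f(P), which is exactly the factor of $n!$ that the formula divides out.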

This correspondence between evaluating an n-ic and pairing it with an
$n^{\text{th}}$ power gives us a new perspective from which to view
Lemma 7.2-5. That lemma tells us that every n-site is a linear combination of
perfect $n^{\text{th}}$ powers of points. So, if we know the value
$\langle f, P^n \rangle$ for every point P in A, we can compute the value
$\langle f, s \rangle$ for any n-site s, which determines the n-form f
completely. That is no surprise, since knowing the value
$\langle f, P^n \rangle$ is the same as knowing the value
$f(P) = \langle f, P^n/n! \rangle$; and an n-form f is determined
by its values f(P) at all points P. Indeed, the process of determining the
polynomial f from a sufficiently large and sufficiently independent set of its
values f(P) is the familiar process of polynomial interpolation.

Back in Chapter 4, when we adopted the homogenized framework, we 
noted that it makes sense to evaluate an n-form at an anchor that isn't a 
point. What does that process mean geometrically? There are two cases. 

Consider first an anchor p whose weight w(p) is nonzero. Such an anchor
is a scalar multiple of a point; so we have p = w(p)P, where the point P
is given by P := p/w(p). Since an n-form f is homogeneous of degree n, it
follows that $f(p) = f(w(p)P) = w(p)^n f(P)$.

The remaining case is more subtle: the case of a vector $\pi$ over A. What
is the value $f(\pi)$? It turns out that

(7.6-5)  $f(\pi) = \dfrac{1}{n!}\,(D_\pi)^n f.$

That is, evaluating an n-form at the vector $\pi$ is the same as
differentiating that n-form n times, each time in the direction of the vector
$\pi$, and then dividing by $n!$. Note that the $n^{\text{th}}$ derivative of
an n-form is a constant, so the right-hand side of Formula 7.6-5 needs no
further evaluation. We won't pause to verify Formula 7.6-5 now, since it will
follow easily once we can differentiate, as well as evaluate, by pairing with
an appropriate site. But Formula 7.6-5 should at least seem plausible, on the
following grounds. A vector is a scaled version of a point at infinity. So
evaluating an n-form at a vector means finding out the leading term of what
happens as we go to infinity in that direction. Fix any point Q in A and
consider the function $g(t) := f(Q + t\pi)$. Taylor's Theorem tells us that
$g(t) = \sum_{0 \le k \le n} g^{(k)}(0)\,t^k/k!$. As t tends to infinity, the
dominant term is the last, in which the coefficient of $t^n$ is
$g^{(n)}(0)/n! = ((D_\pi)^n f)/n! = f(\pi)$. Thus, the value of an n-form at
a vector tells us, in a natural way, about what happens as we go to infinity
in the direction of that vector. Note also that the denominator of $n!$ in
Formula 7.6-5 is closely related to the denominators in a Taylor series.

So, evaluating an n-ic is essentially the same process as differentiating it
n times, always in the same direction; the same process, that is, except
for the annoying factor of $n!$. Our choice of the summed pairing, over the
averaged pairing, means that our formulas for evaluation are cluttered with
factors of $n!$, while our formulas for differentiation, coming soon, are
pretty. For the tradeoff between the two pairings, see Appendix B.

7.6.1 Evaluating the blossom of an n-form 

Blossoming replaces n-ic dependence on a single parameter with n-affine
dependence on n symmetric parameters. By exploiting the paired algebras
of forms and sites, we have learned how to represent n-ic dependence on
a single parameter p as the process of pairing with the n-site $p^n/n!$. This
makes it trivial to blossom: We merely pair, instead, with
$p_1 \cdots p_n/n!$.

Proposition 7.6-6 Let f be any n-form on an affine space A, let
$\tilde f\colon A^n \to \mathbb{R}$ be its multiaffine blossom (a.k.a. polar
form), and let $P_1$ through $P_n$ be any points in A. We then have

(7.6-7)  $\tilde f(P_1, \ldots, P_n) = \langle f, P_1 \cdots P_n/n! \rangle.$

The blossom $\tilde f$ extends uniquely to a multilinear function
$\tilde f\colon \hat A^n \to \mathbb{R}$, which satisfies

(7.6-8)  $\tilde f(p_1, \ldots, p_n) = \langle f, p_1 \cdots p_n/n! \rangle$

for all anchors $p_1$ through $p_n$ over A.

Proof The product $p_1 \cdots p_n$ is a linear function of each factor, is
symmetric, and reduces to the $n^{\text{th}}$ power $p^n$ when
$p_1 = \cdots = p_n = p$. □
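
As a small illustration of Formula 7.6-8, the sketch below computes the pairing $\langle f, p_1 \cdots p_n/n! \rangle$ directly from the Summed Permanent Identity. The representation is an assumption made for the example: a form is a dictionary from multi-indices to integer coefficients, anchors are integer coordinate tuples in a basis dual to $(c_0, \ldots, c_d)$, and the names `factors` and `blossom` are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod

def factors(gamma):
    """Expand a multi-index such as (0, 1, 1) into the factor list [1, 2]."""
    return [k for k, count in enumerate(gamma) for _ in range(count)]

def blossom(form, anchors):
    """<f, p_1 ... p_n / n!> for a form {multi-index: coefficient}."""
    n = len(anchors)
    total = 0
    for gamma, coeff in form.items():
        ks = factors(gamma)  # c^gamma = c_{ks[0]} * ... * c_{ks[n-1]}
        # <c_k, p> is the k-th coordinate of p, so 7.3-1 becomes a permanent.
        total += coeff * sum(
            prod(anchors[nu[i]][ks[i]] for i in range(n))
            for nu in permutations(range(n))
        )
    return Fraction(total, factorial(n))

# Hypothetical example: f = c_0^2 - 5 c_1 c_2 with two anchors P and Q.
f = {(2, 0, 0): 1, (0, 1, 1): -5}
P, Q = (1, 2, 3), (1, 4, 7)
```

The value `blossom(f, [P, Q])` agrees with the familiar polar form $P_0 Q_0 - 5(P_1 Q_2 + P_2 Q_1)/2$, and setting both arguments to P recovers f(P).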

7.7 Formulas for differentiation 

Relating differentiation to pairing is more subtle, because differentiating an
n-form produces an (n - 1)-form, rather than a scalar. Here is the basic
story: Differentiating an n-form f in the direction of a vector $\pi$
corresponds to setting to $\pi$ one of the n factors of the n-site with which
f eventually gets paired. The other factors of that n-site are set only
later, when the derivative $D_\pi f$ is itself evaluated or further
differentiated. This process is best understood from a few examples.
Proposition 7.7-1 Let f be an n-form on the affine space A, let $\pi$ be a
vector over A, and let R be a point in A. Differentiating f in the direction
$\pi$ and then evaluating the resulting (n - 1)-form at R is related to
pairing by the formula:

(7.7-2)  $D_\pi f(R) = \langle f, \pi R^{n-1}/(n-1)! \rangle.$

Proof Formula 7.6-2 tells us how to evaluate by pairing, so we can simply
calculate:

$$\begin{aligned}
D_\pi f(R) &= \lim_{t \to 0} \frac{f(R + t\pi) - f(R)}{t} \\
&= \lim_{t \to 0} \frac{\langle f, (R + t\pi)^n/n! \rangle - \langle f, R^n/n! \rangle}{t} \\
&= \lim_{t \to 0} \langle f, (R + t\pi)^n - R^n \rangle / n!\,t \\
&= \lim_{t \to 0} \langle f, (R^n + nt\pi R^{n-1} + O(t^2)) - R^n \rangle / n!\,t \\
&= \lim_{t \to 0} \langle f, nt\pi R^{n-1} + O(t^2) \rangle / n!\,t \\
&= \langle f, \pi R^{n-1}/(n-1)! \rangle.
\end{aligned}$$

Note that $R^{n-1}/(n-1)!$ is the site with which we would pair the
(n - 1)-form $D_\pi f$, in order to evaluate it at the point R. So the
differentiation merely sets to $\pi$ one of the factors of the n-site with
which f eventually gets paired. □

The fact that $\pi$ is a vector, that is, that its weight is 0, plays no role
in that proof. Indeed, we shall use the standard limit formula to define the
derivative of an n-form $f$ on $A$ in the direction of any anchor $p$ over $A$:

$$D_p f(r) := \lim_{t \to 0} \frac{f(r + tp) - f(r)}{t}.$$

With this definition, Formula 7.7-2 extends from vectors to arbitrary
anchors; we have $D_p f(R) = \langle f, p R^{n-1}/(n-1)! \rangle$. But we have to be a bit
careful. In some contexts, the directions in which it is legal to differentiate
are restricted to be vectors. For example, when we say that two forms $f$ and
$g$, possibly of different degrees, agree to k-th order at a point $R$, we are saying
that $D_{\pi_1} \cdots D_{\pi_j} f(R) = D_{\pi_1} \cdots D_{\pi_j} g(R)$, for any $j \le k$ and any vectors $\pi_1$
through $\pi_j$. But only vectors are permissible as directions in this context,
not arbitrary anchors, as discussed in Section 7.11.

Differentiating multiple times is an easy generalization. 

Proposition 7.7-3 Let $f$ be an n-form on the affine space $A$ and let $p_1$
through $p_k$ and $r$ be anchors over $A$, for some $k \le n$. Taking $f$ and
differentiating k times, in the directions of the anchors $p_1$ through $p_k$, and then
evaluating the resulting (n-k)-form at $r$ is related to pairing by the formula

(7.7-4)    $D_{p_1} \cdots D_{p_k} f(r) = \langle f, p_1 \cdots p_k r^{n-k}/(n-k)! \rangle.$

Proof The argument for k = 2 should make the pattern clear:

$$
\begin{aligned}
D_p D_q f(r) &= \lim_{t \to 0} \frac{D_q f(r + tp) - D_q f(r)}{t} \\
&= \lim_{t \to 0} \frac{\langle f, q (r + tp)^{n-1}/(n-1)! \rangle - \langle f, q r^{n-1}/(n-1)! \rangle}{t} \\
&= \lim_{t \to 0} \frac{\langle f, q\,((n-1)\, t p\, r^{n-2} + O(t^2)) \rangle}{(n-1)!\, t} \\
&= \langle f, p q r^{n-2}/(n-2)! \rangle. \qquad \Box
\end{aligned}
$$
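
Formulas 7.7-2 and 7.7-4 can be spot-checked in coordinates. The sketch below
is a minimal illustration under stated assumptions: forms and sites over an
affine plane are stored as dictionaries from exponent tuples to coefficients,
forms in a coanchor basis (w, u, v) and sites in the dual anchor basis, and the
summed pairing of two monomials is taken to be alpha! when the exponents match
and 0 otherwise. All helper names (`mul`, `anchor`, `pair`, `deriv`,
`evaluate`) and the sample cubic are hypothetical.

```python
from fractions import Fraction
from math import factorial, prod


def mul(a, b):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return out


def anchor(coords):
    """The 1-site (anchor) with the given coordinates in the anchor basis."""
    n = len(coords)
    return {tuple(int(i == j) for j in range(n)): c for i, c in enumerate(coords)}


def pair(form, site):
    """Summed pairing: <x^a, e^b> is a! when a == b and 0 otherwise."""
    return sum(c * site.get(e, 0) * prod(factorial(k) for k in e)
               for e, c in form.items())


def deriv(form, coords):
    """Directional derivative of a form, computed coordinate by coordinate."""
    out = {}
    for e, c in form.items():
        for i, p in enumerate(coords):
            if e[i] > 0:
                lower = e[:i] + (e[i] - 1,) + e[i + 1:]
                out[lower] = out.get(lower, 0) + c * e[i] * p
    return out


def evaluate(form, coords):
    """Value of a form at the anchor with the given coordinates."""
    return sum(c * prod(x ** k for x, k in zip(coords, e)) for e, c in form.items())


# f = w^3 + 2wuv - 5u^2 v + v^3, an arbitrary cubic form (n = 3).
f = {(3, 0, 0): 1, (1, 1, 1): 2, (0, 2, 1): -5, (0, 0, 3): 1}
n = 3
p, q, r = (0, 1, 2), (1, -1, 3), (1, 2, -1)   # p has weight 0; q and r have weight 1

# Formula 7.7-2 with an anchor direction: D_p f(r) = <f, p r^(n-1)/(n-1)!>.
direct1 = evaluate(deriv(f, p), r)
paired1 = Fraction(pair(f, mul(anchor(p), mul(anchor(r), anchor(r)))), factorial(n - 1))

# Formula 7.7-4 with k = 2: D_p D_q f(r) = <f, p q r^(n-2)/(n-2)!>.
direct2 = evaluate(deriv(deriv(f, q), p), r)
paired2 = Fraction(pair(f, mul(mul(anchor(p), anchor(q)), anchor(r))), factorial(n - 2))
```

Note that the direction q here has weight 1, so the check also exercises the
extension from vectors to arbitrary anchors.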



Each differentiation thus sets one factor of the n-site $s$ with which the
n-form $f$ eventually gets paired. Differentiating k times leaves us with an
(n-k)-form, which we can then evaluate at an anchor $r$ by setting the
remaining n-k factors of $s$ to $r^{n-k}/(n-k)!$. If we differentiate n times,
we get the constant $D_{p_1} \cdots D_{p_n} f(r) = \langle f, p_1 \cdots p_n \rangle$, independent of $r$. If we
further specialize to the case $p_1 = \cdots = p_n = p$ in which all n directions are
the same, we find that $(D_p)^n f = \langle f, p^n \rangle$. Since the rule for evaluation at $p$
is $f(p) = \langle f, p^n/n! \rangle$, we see that the relationship between evaluating an n-ic
and differentiating it n times is indeed as we claimed in Formula 7.6-5.

For the record, here is the formula for differentiating an n-form k times
and then evaluating the blossom of the resulting (n-k)-form at the anchors
$r_1$ through $r_{n-k}$:

$$(D_{p_1} \cdots D_{p_k} f)^{\sim}(r_1, \ldots, r_{n-k}) = \langle f, p_1 \cdots p_k\, r_1 \cdots r_{n-k}/(n-k)! \rangle.$$



7.8 The contraction operators 

If we set k of the factors of the n-site with which an n-form will eventually
get paired, we have essentially converted that n-form into an (n-k)-form.
The operator that does that conversion is called contraction.

Let $A$ be an affine space, let $f$ be an n-form on $A$, and let $s$ be a k-site
over $A$, where $k \le n$. In the special case k = n, we know how to combine
$f$ with $s$ to produce a real number: the pairing value $\langle f, s \rangle$. When k < n,
we can't get a real number. But we can produce, from $f$ and $s$, a mapping
that takes (n-k)-sites to real numbers: the mapping $t \mapsto \langle f, st \rangle$, for any
(n-k)-site $t$. This mapping is an element of the dual space $\mathrm{Sym}_{n-k}(\hat A)^*$,
which we are representing as the space $\mathrm{Sym}_{n-k}(\hat A^*)$ of (n-k)-forms. Thus,
the n-form $f$ and the k-site $s$ together determine an (n-k)-form, which is
written $f \lfloor s$ and called the contraction of f on s or the s-contraction of f.
The terms "internal product" and "inner product" are also used. Note that,
in the expression $f \lfloor s$, the vertical stroke of the operator symbol is next to
the operand of higher degree.

Formally speaking, the contraction $f \lfloor s$ of $f$ on $s$ is completely defined
by the equation

(7.8-1)    $\langle f \lfloor s, t \rangle = \langle f, st \rangle.$

Intuitively, contraction is a flavor of partial evaluation. We can think of
our n-form $f$ as a function that accepts an n-site as its input and returns a
real number. When we contract $f$ on a k-site $s$, we are declaring that we
are interested in the values of that function only on those n-sites that are
multiples of $s$.
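
In coordinates, Equation 7.8-1 can be turned into a small computation. The
following sketch assumes the same hypothetical dictionary representation of
forms and sites in dual bases, with the summed pairing of matching monomials
equal to alpha!. Under that assumption, Equation 7.8-1 forces the monomial
rule x^a contracted on e^b to be a!/(a-b)! x^(a-b) when b <= a componentwise
and 0 otherwise; the helper `contract` and the sample data are ours.

```python
from math import factorial, prod


def mul(a, b):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return out


def anchor(coords):
    """The 1-site (anchor) with the given coordinates in the anchor basis."""
    n = len(coords)
    return {tuple(int(i == j) for j in range(n)): c for i, c in enumerate(coords)}


def pair(form, site):
    """Summed pairing: <x^a, e^b> is a! when a == b and 0 otherwise."""
    return sum(c * site.get(e, 0) * prod(factorial(k) for k in e)
               for e, c in form.items())


def contract(form, site):
    """Contraction of a form on a site, monomial by monomial."""
    out = {}
    for a, cf in form.items():
        for b, cs in site.items():
            if all(x >= y for x, y in zip(a, b)):
                rest = tuple(x - y for x, y in zip(a, b))
                scale = prod(factorial(x) // factorial(x - y) for x, y in zip(a, b))
                out[rest] = out.get(rest, 0) + cf * cs * scale
    return out


# f = w^3 + 2wuv - 5u^2 v + v^3, an arbitrary cubic form.
f = {(3, 0, 0): 1, (1, 1, 1): 2, (0, 2, 1): -5, (0, 0, 3): 1}
p, q, r = anchor((0, 1, 2)), anchor((1, -1, 3)), anchor((1, 2, -1))
s = mul(p, q)                                # a 2-site

lhs = pair(contract(f, s), r)                # <f |_ s, t> with t = r
rhs = pair(f, mul(s, r))                     # <f, s t>
full = contract(f, mul(s, r))                # k = n: a 0-form, i.e. a scalar
too_big = contract(f, mul(mul(s, r), r))     # k > n: no monomial survives
```

The last two values preview the special cases k = n and k > n.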

In the special case k = n, contracting an n-form $f$ on a k-site $s$ results in
a 0-form $f \lfloor s$, that is, in a scalar. By setting $t$ in Equation 7.8-1 to be the
0-site $t := 1$, we find that $\langle f \lfloor s, 1 \rangle = \langle f, s \cdot 1 \rangle = \langle f, s \rangle$. Since pairing a 0-form
with a 0-site simply multiplies the two scalars, as discussed in Exercise 7.3-3,
we conclude that $f \lfloor s = \langle f \lfloor s, 1 \rangle = \langle f, s \rangle$. Thus, when k = n, contraction
reduces to pairing.

It is convenient to extend the contraction operator $f \lfloor s$ to the case k > n
by setting $f \lfloor s = 0$. To support this, we make the convention that 0, which we
have already agreed is an m-form for every nonnegative m, is also an m-form
(in fact, is the unique m-form) when m = n-k is negative. Extending the
contraction operator in this way makes the value $f \lfloor s$ well-defined whenever
$f$ and $s$ are homogeneous, whatever their degrees. We further extend to
those cases where the arguments $f$ and $s$ are inhomogeneous in the unique
way that preserves linearity. Having done so, the form $f \lfloor s$ is now well-defined
for any form $f$ and any site $s$, even inhomogeneous ones.

Successive contractions commute with each other. Indeed, we have the
identity $(f \lfloor s) \lfloor t = (f \lfloor t) \lfloor s = f \lfloor (st)$. When $f$, $s$, and $t$ are all homogeneous,
this follows because all three expressions denote the unique form of degree
deg(f) - deg(s) - deg(t) that, when paired with any site $u$ of that degree,
returns the real number $\langle f, stu \rangle$. When $f$, $s$, or $t$ is inhomogeneous, the
result follows by linearity.

Just as we can contract a form on a site, we can contract a site on a form.
If $s$ is an m-site and $g$ is a k-form, the expression $g \rfloor s$ denotes an (m-k)-site
called the g-contraction of s or the contraction of s on g. It is the unique
(m-k)-site that makes $\langle f, g \rfloor s \rangle = \langle fg, s \rangle$, for all (m-k)-forms $f$ on $A$. We
extend this dual contraction operator also to return zero when k > m, and
we further extend it by linearity to the inhomogeneous case.

If $f$ is an n-form and $s$ is an m-site, don't get the two contractions $f \lfloor s$
and $f \rfloor s$ confused. The operator with its vertical bar on the left, the form
side, produces an (n-m)-form, while the one with its vertical bar on the
right, the site side, produces an (m-n)-site. If n and m are distinct, at least
one of the two results will have negative degree and hence will perforce be
zero. If n = m, we have $f \lfloor s = f \rfloor s = \langle f, s \rangle$.

7.9 Differentiation as contraction

We are interested in contractions primarily because they give us a more
concise way to write down the rule for how to differentiate in the paired
algebras. In particular, Formula 7.7-2 tells us that

$$D_p f(r) = \langle f, p r^{n-1}/(n-1)! \rangle = \langle f \lfloor p, r^{n-1}/(n-1)! \rangle.$$

Since this holds for all anchors $r$, we conclude that the (n-1)-form $D_p f$
coincides with the contraction $f \lfloor p$; that is, we have the simpler formula

$$D_p f = f \lfloor p.$$

Rephrasing that in English, we finally have a rule for differentiation that is
worthy to stand alongside our evaluation rule:

To differentiate under the summed pairing, simply contract.

This rule is deliciously simple; in particular, note that the degree of the
form being differentiated is irrelevant. Indeed, the formula $D_p f = f \lfloor p$
holds even for forms $f$ that are inhomogeneous. That delicious simplicity is
the reward that we have earned by tolerating the annoying factor of n! in
our evaluation rule. The rules for evaluation and differentiation under the
averaged pairing are different, as discussed in Appendix B.

Contracting on an anchor $p$, even one that isn't a vector, corresponds
to differentiating in the direction $p$. Thus, all of the standard formulas for
differentiation carry over, including the product rule,

$$(fg) \lfloor p = (f \lfloor p)\, g + f\, (g \lfloor p),$$

and the rule for perfect powers,

$$f^k \lfloor p = k f^{k-1} (f \lfloor p).$$

These identities hold for any forms $f$ and $g$, regardless of their degrees,
and without even any requirement of homogeneity. But it is critical that
$p$ be an anchor, that is, a 1-site. Contracting on a 0-site, that is, on a
real number $b$, is simply scalar multiplication; so we get the simpler rules
$(fg) \lfloor b = f (g \lfloor b) = (f \lfloor b) g = b f g$ and $f^k \lfloor b = b f^k$. Contracting on m-sites
for m > 1 is more complicated, like differentiating m times; for example, if
$f$ and $g$ are forms and $p$ and $q$ are anchors, we can calculate that

$$fg \lfloor pq = (f \lfloor pq)\, g + (f \lfloor p)(g \lfloor q) + (f \lfloor q)(g \lfloor p) + f\, (g \lfloor pq).$$
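
These identities are easy to test mechanically. The sketch below again assumes
the hypothetical dictionary representation of forms and sites in dual bases,
in which contracting on a monomial site acts like the corresponding partial
derivative; the two sample forms are deliberately inhomogeneous, and all names
are ours.

```python
from math import factorial, prod


def mul(a, b):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return out


def add(*polys):
    """Sum of polynomials, dropping zero coefficients so results compare cleanly."""
    out = {}
    for poly in polys:
        for e, c in poly.items():
            out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}


def anchor(coords):
    """The 1-site (anchor) with the given coordinates in the anchor basis."""
    n = len(coords)
    return {tuple(int(i == j) for j in range(n)): c for i, c in enumerate(coords)}


def contract(form, site):
    """Contraction of a form on a site: x^a on e^b gives a!/(a-b)! x^(a-b)."""
    out = {}
    for a, cf in form.items():
        for b, cs in site.items():
            if all(x >= y for x, y in zip(a, b)):
                rest = tuple(x - y for x, y in zip(a, b))
                scale = prod(factorial(x) // factorial(x - y) for x, y in zip(a, b))
                out[rest] = out.get(rest, 0) + cf * cs * scale
    return out


f = {(2, 0, 0): 1, (0, 1, 0): 1}       # w^2 + u, deliberately inhomogeneous
g = {(0, 1, 1): 1, (1, 0, 0): -3}      # uv - 3w
p, q = anchor((1, 2, 0)), anchor((0, 1, -1))
pq = mul(p, q)

product_rule = add(contract(mul(f, g), p)) == add(
    mul(contract(f, p), g), mul(f, contract(g, p)))

second_order = add(contract(mul(f, g), pq)) == add(
    mul(contract(f, pq), g),
    mul(contract(f, p), contract(g, q)),
    mul(contract(f, q), contract(g, p)),
    mul(f, contract(g, pq)))

# Power rule with k = 3: f^3 contracted on p equals 3 f^2 (f contracted on p).
term = mul(mul(f, f), contract(f, p))
power_rule = add(contract(mul(f, mul(f, f)), p)) == add(term, term, term)
```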
7.10 Derivations and differential operators

Let $G = \bigoplus_{n \ge 0} G_n$ be any graded algebra. A linear map $\delta\colon G \to G$ is called
a derivation when it satisfies the product rule

$$\delta(xy) = \delta(x)\, y + x\, \delta(y),$$

for all $x$ and $y$ in $G$. A derivation $\delta$ is said to lower degree by 1 when
$\delta(G_n) \subseteq G_{n-1}$, that is, when $\delta$ maps every element $x$ that is homogeneous of
degree n to an element $\delta(x)$ that is homogeneous of degree n-1.

We have been studying the algebra of forms $\mathrm{Sym}(\hat A^*)$, which is a graded
algebra. For any anchor $p$ over $A$, let $\delta_p\colon \mathrm{Sym}(\hat A^*) \to \mathrm{Sym}(\hat A^*)$ be the map
that contracts on $p$, so that $\delta_p(f) := f \lfloor p$. For any fixed anchor $p$ over $A$,
the map $\delta_p$ is a derivation that lowers degree by 1.

It turns out that every derivation $\delta\colon \mathrm{Sym}(\hat A^*) \to \mathrm{Sym}(\hat A^*)$ that lowers
degree by 1 is of the form $\delta = \delta_p$, for some anchor $p$ over $A$. Here is why. Since
$\delta$ lowers degree by 1, $\delta$ must map 1-forms to real numbers; so $\delta$ restricts to a
linear functional on coanchors. But every such linear functional corresponds
to pairing with some anchor. So there exists some anchor $p$ with $\delta(h) = \langle h, p \rangle$,
for all coanchors $h$. Rephrasing this, we have $\delta(f) = f \lfloor p = \delta_p(f)$ for every
1-form $f$ on $A$. We also have $\delta(1) = 1 \lfloor p = \delta_p(1) = 0$, since the only
way that $\delta$ can lower the degree of the 0-form 1 is by taking it to 0, the
unique (-1)-form. (See also Exercise 7.10-1.) But every form $f$ on $A$ can
be written as a linear combination of products of zero or more coanchors.
The derivations $\delta$ and $\delta_p$ agree on the empty product 1, they agree on all
coanchors, they are both linear, and they both satisfy the product rule; so
we can conclude that $\delta(f) = \delta_p(f)$ for all forms $f$.

Exercise 7.10-1 Let $\delta\colon G \to G$ be any derivation of an algebra $G$. Without
any assumption about what the derivation $\delta$ does to degrees, show that
$\delta(1) = 0$. (Hint: Substitute $x := y := 1$ in the product rule.)

Exercise 7.10-2 Define a map $\delta\colon \mathrm{Sym}(\hat A^*) \to \mathrm{Sym}(\hat A^*)$ by setting
$\delta(f) := nf$, for every n-form $f$. Show that $\delta$ is a derivation of the algebra of
forms that leaves degree unchanged.

Answer: For an n-form $f$ and an m-form $g$, we have $\delta(fg) = (n + m) fg =
nfg + mfg = \delta(f)\, g + f\, \delta(g)$.
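
The degree-counting derivation of Exercise 7.10-2 is simple enough to test
directly. This sketch assumes polynomials stored as dictionaries from exponent
tuples to coefficients; the name `euler` and the sample forms are hypothetical,
and the map is extended to inhomogeneous forms by linearity.

```python
def mul(a, b):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return out


def add(*polys):
    """Sum of polynomials, dropping zero coefficients so results compare cleanly."""
    out = {}
    for poly in polys:
        for e, c in poly.items():
            out[e] = out.get(e, 0) + c
    return {e: c for e, c in out.items() if c != 0}


def euler(form):
    """delta(f) := n f on each homogeneous component of degree n."""
    return {e: sum(e) * c for e, c in form.items() if sum(e) * c != 0}


f = {(2, 1, 0): 1, (0, 0, 1): -1}      # w^2 u - v  (degrees 3 and 1)
g = {(0, 1, 1): 4, (0, 0, 0): 7}       # 4uv + 7    (degrees 2 and 0)

lhs = add(euler(mul(f, g)))
rhs = add(mul(euler(f), g), mul(f, euler(g)))
keeps_degree = set(euler(f)) <= set(f)   # no monomial changes its exponents
```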

So every derivation of the algebra of forms that lowers degree by 1 simply
contracts on some anchor. Those derivations have the additional pleasant
property that they all commute with each other. For any anchors $p$ and
$q$ and any form $f$, we have $(f \lfloor p) \lfloor q = (f \lfloor q) \lfloor p = f \lfloor (pq)$; so we have
$\delta_p \circ \delta_q = \delta_q \circ \delta_p$. Indeed, the differential operators $D_p$ and $D_q$ actually commute
with each other more generally; we have $D_p(D_q(f)) = D_q(D_p(f))$, not just
for the functions $f$ in $\mathrm{Poly}(A, \mathbb{R})$, but at least for all real-valued functions
$f\colon A \to \mathbb{R}$ that are twice continuously differentiable.

Since the derivations $\delta_p$, for anchors $p$ in $\hat A$, commute with each other, we
can use them to build up a commutative algebra: the algebra of all differential
operators that can be expressed as polynomials in the derivations $(\delta_p)_{p \in \hat A}$.
Note that two different polynomials in the variables $(\delta_p)_{p \in \hat A}$ may denote the
same operator. For example, if $E := (Q + R + S)/3$ is the centroid of a
reference triangle $\triangle QRS$ in $A$, then $3\delta_E = \delta_{3E} = \delta_Q + \delta_R + \delta_S$. Sound
familiar? Indeed, this algebra of differential operators is simply the algebra
of sites in disguise. Any site $s$ on $A$ gives us such a differential operator by
contraction, by the rule $f \mapsto f \lfloor s$.

Thus, if we already understand the algebra of forms, one way to construct
the algebra of sites is as a certain algebra of differential operators on
forms. For example, suppose that $A$ is an affine plane. Working in Cartesian
coordinates, we could define an n-site over $A$ to be a polynomial that is
homogeneous of degree n, not in the three anchors $(C, \varphi, \psi)$, but in the three
derivations $(\partial/\partial w, \partial/\partial u, \partial/\partial v)$. We would then pair an n-site $s$ with an
n-form $f$ by applying, to the form $f$, the differential operator that $s$ denotes.

I've never seen anyone do so, but it would make equal sense to treat sites
as basic and to define forms as certain differential operators on sites, replacing
the three coanchors $(w, u, v)$ by the derivations $(\partial/\partial C, \partial/\partial\varphi, \partial/\partial\psi)$.

People who define sites to be differential operators on forms get the right
answers, but they obscure the fundamental symmetry between forms and
sites. Suppose that we have somehow defined the algebra of forms $\mathrm{Sym}(\hat A^*)$.
Whatever technique we used to algebrize the linear space $\hat A^*$ of coanchors
would surely work, equally well, to algebrize the space $\hat A$ of anchors, thus
producing the algebra of sites $\mathrm{Sym}(\hat A)$. It seems more natural to produce
forms and sites via the same technology, rather than to exploit differential
operators to define one of them in terms of the other.

It also seems strange, when talking about differential operators, to restrict
ourselves to operators that are polynomials in the three derivations $\partial/\partial w$,
$\partial/\partial u$, and $\partial/\partial v$. Typically, when defining differential operators, we also
allow multiplying by $w$, $u$, or $v$; for example, $u(\partial/\partial u)$ is a common differential
operator that preserves degree. Of course, differential operators of this more
general type typically don't commute; for example, the operator $u(\partial/\partial u)$
first differentiates with respect to $u$ and then multiplies by $u$, not the reverse.

Exercise 7.10-3 People who define sites to be differential operators on
forms are naturally led to one of the two possible pairings. Which one is
it, the summed pairing or the averaged pairing?

Answer: The summed pairing. For example, they compute the real number
$\langle w^n, C^n \rangle = \langle w^n, (\partial/\partial w)^n \rangle$ by applying the operator $(\partial/\partial w)^n$ to the
n-form $w^n$, getting $(\partial/\partial w)^n (w^n) = n!$, rather than 1.

Math remark: The derivations that we have defined, maps from an algebra
to itself, are a special case. There is a more general notion of a derivation
as a linear map $\delta\colon G \to H$ that satisfies the product rule, where $G$ is a
commutative algebra and $H$ is a G-module. Generalizing the notion of a
derivation in this way lets us construct, for any commutative algebra, a
universal derivation of that algebra. For a concrete example, suppose that
$G = \mathrm{Sym}(\hat A^*) = \mathbb{R}[w, u, v]$ is a polynomial algebra in three variables. The
universal derivation of $G$ is the map $d\colon G \to H$ defined by

$$d(f) := \frac{\partial f}{\partial w}\, dw + \frac{\partial f}{\partial u}\, du + \frac{\partial f}{\partial v}\, dv,$$

where $H$ is the free G-module with basis $(dw, du, dv)$. Any derivation of $G$
can then be achieved by substituting appropriate values for the three symbols
$dw$, $du$, and $dv$. For example, if we substitute scalars $p_w$, $p_u$, and $p_v$ for $dw$,
$du$, and $dv$, we get a derivation $\delta\colon G \to G$ that lowers degree by 1; in fact,
we get the derivation $\delta_p$ associated with the anchor $p = p_w C + p_u \varphi + p_v \psi =
p_w (\partial/\partial w) + p_u (\partial/\partial u) + p_v (\partial/\partial v)$. For another example, if we substitute $w$,
$u$, and $v$ for $dw$, $du$, and $dv$, we get $w(\partial/\partial w) + u(\partial/\partial u) + v(\partial/\partial v)$, the
degree-preserving derivation of Exercise 7.10-2.



7.11 Agreement to k-th order

An n-form / on A can be evaluated, not only at points in A, but at any 
anchor over A. As a consequence, / can also be differentiated, not only 
in the directions of vectors over A, but in the direction of any anchor over 
A. Evaluating a form at arbitrary anchors doesn't lead to confusion. But 
differentiating a form in the directions of anchors that aren't vectors leads to 
a subtlety that is worth discussing. 

By the way, these generalized flavors of evaluation and differentiation
became available to us as soon as we homogenized. We started out, in the
nested-spaces framework, with a polynomial function $f\colon A \to \mathbb{R}$ of degree
at most n. In converting to the homogenized approach, we linearized the
domain space $A$ into $\hat A$ and we homogenized $f$ into the n-form $f\colon \hat A \to \mathbb{R}$.
Already at this point, it started making sense to use arbitrary anchors in
evaluation, and hence also in differentiation. Thus, the subtlety that this
section discusses has nothing to do with the algebra of sites.

The subtlety involves the naive concept of "all possible derivatives". In 
some cases, what this turns out to mean, precisely, is the derivatives in all 
possible directions that are vectors — but not the derivatives in directions 
that are anchors of nonzero weight. 

For example, consider the notion of "agreement to k-th order". Two
smooth, real-valued functions $f$ and $g$ defined on $A$ are said to agree to
k-th order at a point $P$ in $A$ when

$$D_{\pi_1} \cdots D_{\pi_j} f(P) = D_{\pi_1} \cdots D_{\pi_j} g(P),$$

for all $j$ in [0 . . k] and all vectors $\pi_1$ through $\pi_j$ over $A$. That is, all derivatives
of $f$ and $g$ of order at most k agree at $P$.

Suppose now that $f$ is given by a polynomial of degree at most n, and
let that same symbol $f$ denote the resulting n-form; and similarly for $g$, an
m-form. The identity above then reduces to the identity

$$\langle f, \pi_1 \cdots \pi_j P^{n-j}/(n-j)! \rangle = \langle g, \pi_1 \cdots \pi_j P^{m-j}/(m-j)! \rangle.$$

Might this identity hold with the vectors $\pi_1$ through $\pi_j$ generalized to become
arbitrary anchors?

If n = m, then that generalized identity does hold. A more concise way
to phrase the situation is as follows: Two n-forms $f$ and $g$ agree to k-th order
at $P$ just when $\langle f, s \rangle = \langle g, s \rangle$ for all n-sites $s$ that are multiples of $P^{n-k}$, as
we essentially saw in Proposition 6.7-2.

If k = 0 and hence j = 0, the generalized identity holds trivially, since
there are no parameters $\pi_i$ to remove restrictions from.

But, if n and m are distinct and $k \ge 1$, there is no hope. Substituting
$\pi_1 := \cdots := \pi_j := P$, we find that we must have $\langle f, P^n/(n-j)! \rangle =
\langle g, P^m/(m-j)! \rangle$ for all $j$ from 0 to k, and that is possible only if both $f$ and
$g$ are zero to k-th order at $P$. Thus, when we require two forms of differing
degrees to agree to some order at some point, we must restrict the directions
of differentiation $(\pi_i)$ to be vectors.

Suppose that $f$ is a fixed n-form and that we want to determine $g$ to be
the unique k-form that agrees with $f$ to k-th order at $P$. How do we construct
that unique $g$ via the paired algebras? We must arrange that

(7.11-1)    $\langle f, \pi_1 \cdots \pi_j P^{n-j}/(n-j)! \rangle = \langle g, \pi_1 \cdots \pi_j P^{k-j}/(k-j)! \rangle$

for all $j$ in [0 . . k] and all vectors $\pi_1$ through $\pi_j$ over $A$. Let $(P, \varphi_1, \ldots, \varphi_d)$ be
some Cartesian reference frame for the affine d-space $A$ that uses the point $P$
as the center of its coordinate system. Every k-site over $A$ can be expanded
as a linear combination of monomials of the form $P^{k-j} \varphi^\alpha$, where $j$ is in [0 . . k]
and $\alpha = (\alpha_1, \ldots, \alpha_d)$ is a multi-index with $|\alpha| := \alpha_1 + \cdots + \alpha_d = j$. For any
monomial k-site of the form $P^{k-j} \varphi^\alpha$, Equation 7.11-1 tells us the value that
we must assign to $\langle g, P^{k-j} \varphi^\alpha \rangle$. Assigning arbitrary values to those pairings
determines a unique k-form $g$, since those monomials form a basis for the
space $\mathrm{Sym}_k(\hat A)$ of k-sites. And the k-form $g$ that is so determined will, in
fact, satisfy Equation 7.11-1 for all vectors $\pi_1$ through $\pi_j$, since each $\pi_i$ is a
linear combination of $(\varphi_1, \ldots, \varphi_d)$.

In Section 8.4.2, we shall analyze a differencing algorithm for computing
n-th derivatives of an n-form, when that n-form is given to us by its values
at the points of an evenly n-divided d-simplex. That algorithm is another
example where we must restrict the directions of differentiation to be vectors.



Chapter 8 

Exploiting the Pairing 



We have built the algebra of sites, in parallel with the algebra of forms; and 
we have chosen, for each n, a pairing map between n-sites and n-forms. So 
each dual functional on n-forms is now represented, for us, by an n-site. 
Symmetrically, each dual functional on n-sites is represented by an n-form. 
In this chapter, we study several ways in which those representations clarify 
and simplify CAGD. 

8.1 The duals of popular monomial bases 

Several of the most popular bases for the linear space $\mathrm{Sym}_n(\hat A^*)$ of n-forms
on $A$ are monomial bases. For example, a power basis is the monomial basis
associated with a Cartesian reference frame for $A$, while a Bernstein basis
is a rescaling of the monomial basis associated with a barycentric reference
frame. In the paired-algebras framework, Proposition 7.4-2 tells us that the
duals of these popular bases are also rescalings of monomial bases.

8.1.1 Power-basis forms and Taylor-basis sites 

Let $A$ be an affine space of finite dimension d, and let $(C, \varphi_1, \ldots, \varphi_d)$ be
a Cartesian reference frame for $A$. The point $C$ together with the vectors
$\varphi_1$ through $\varphi_d$ form a basis $(C, \varphi_1, \ldots, \varphi_d)$ for the linearized space $\hat A$. Let
$(w, u_1, \ldots, u_d)$ be the dual basis for $\hat A^*$. The monomials of total degree n in
the variables $(w, u_1, \ldots, u_d)$ form the power basis for n-forms on $A$ associated
with this reference frame. To denote those monomials, let $\alpha := (\alpha_0, \ldots, \alpha_d)$
be a multi-index with $|\alpha| = n$, and let $\alpha^+$ denote the dehomogenized
multi-index $\alpha^+ := (\alpha_1, \ldots, \alpha_d)$, so that $\alpha_0 + |\alpha^+| = |\alpha| = n$. The power basis
consists of the n-forms $(w^{\alpha_0} u^{\alpha^+})_{|\alpha|=n}$.

We now apply Proposition 7.4-2. Since we have adopted the summed
pairing, we conclude that the dual basis for n-sites is the rescaled monomial
basis $(C^{\alpha_0} \varphi^{\alpha^+}/\alpha!)_{|\alpha|=n}$. We shall refer to this basis as the Taylor basis
for n-sites associated with the reference frame $(C, \varphi_1, \ldots, \varphi_d)$, since pairing
an n-form with the n-sites in this Taylor basis $(C^{\alpha_0} \varphi^{\alpha^+}/\alpha!)_{|\alpha|=n}$ precisely
corresponds to expanding that n-form in a Taylor series around $C$.

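Take the case d = 2 and n = 3, for a concrete example, writing the reference
frame as $(C, \varphi, \psi)$ and its dual basis as $(w, u, v)$. Here, listed in
successive columns, are the power-basis cubic forms, the Taylor-basis cubic sites,
and the results of pairing those cubic sites with an arbitrary cubic form $f$:

    power-basis form   Taylor-basis site     pairing with f
    $w^3$              $C^3/6$               $f(C)$
    $w^2 u$            $C^2\varphi/2$        $D_\varphi f(C)$
    $w^2 v$            $C^2\psi/2$           $D_\psi f(C)$
    $w u^2$            $C\varphi^2/2$        $(D_\varphi)^2 f(C)/2$
    $w u v$            $C\varphi\psi$        $D_\varphi D_\psi f(C)$
    $w v^2$            $C\psi^2/2$           $(D_\psi)^2 f(C)/2$
    $u^3$              $\varphi^3/6$         $(D_\varphi)^3 f/6$
    $u^2 v$            $\varphi^2\psi/2$     $(D_\varphi)^2 D_\psi f/2$
    $u v^2$            $\varphi\psi^2/2$     $D_\varphi (D_\psi)^2 f/2$
    $v^3$              $\psi^3/6$            $(D_\psi)^3 f/6$

Note that all three of the corner sites in this example represent evaluations.
The site $C^3/6 = \varepsilon_C$ represents evaluation at the center point $C$, clearly. But
the site $\varphi^3/6 = \varepsilon_\varphi$ also represents evaluation, namely evaluation at the
vector $\varphi$; we have $f(\varphi) = \langle f, \varphi^3/6 \rangle = (D_\varphi)^3 f/6$, and similarly for the
site $\psi^3/6 = \varepsilon_\psi$.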

8.1.2 Bernstein-basis forms and Bezier-basis sites 

Let's consider Bernstein bases for n-forms next. Let $(R_0, \ldots, R_d)$ be a
barycentric reference frame for the d-dimensional affine space $A$. And let
$(r_0, \ldots, r_d)$ be the basis for $\hat A^*$ that is dual to the basis $(R_0, \ldots, R_d)$ for $\hat A$.
The Bernstein basis for n-forms on $A$ associated with the reference frame
$(R_0, \ldots, R_d)$ (or with the reference d-simplex $[R_0, \ldots, R_d]$) consists of the
n-forms $\bigl(\binom{n}{\alpha} r^\alpha\bigr)_{|\alpha|=n}$. The multinomial scaling factor $\binom{n}{\alpha}$ makes the
Bernstein n-forms a partition of unity; that is, we have

$$
\begin{aligned}
\sum_{|\alpha|=n} \binom{n}{\alpha} r^\alpha(P)
&= \sum_{|\alpha|=n} \binom{n}{\alpha} r_0(P)^{\alpha_0} \cdots r_d(P)^{\alpha_d} \\
&= \bigl(r_0(P) + \cdots + r_d(P)\bigr)^n = 1^n = 1,
\end{aligned}
$$

for all points $P$ in $A$.

What basis is dual to the Bernstein basis? In traditional approaches
to CAGD, that dual basis consisted of certain dual functionals $(\rho_\alpha)_{|\alpha|=n}$.
The d + 1 functionals at the corners, from $\rho_{(n,0,\ldots,0)}$ through $\rho_{(0,\ldots,0,n)}$, were
recognized as being the point evaluations $\varepsilon_{R_0}$ through $\varepsilon_{R_d}$. But the remaining
dual functionals were not typically viewed as having any simple form.

The paired-algebras framework lets us represent every one of those dual
functionals quite simply, as a monomial in the points $(R_0, \ldots, R_d)$. By
Proposition 7.4-2, the basis dual to the Bernstein basis consists of the n-sites
$(R^\alpha/n!)_{|\alpha|=n}$. Note that the corner n-sites are again point evaluations, from
$\varepsilon_{R_0} = R_0^n/n!$ through $\varepsilon_{R_d} = R_d^n/n!$. The internal n-sites represent evaluations
of the blossom; we have

$$\langle f, R^\alpha/n! \rangle = \tilde f(\underbrace{R_0, \ldots, R_0}_{\alpha_0}, \ldots, \underbrace{R_d, \ldots, R_d}_{\alpha_d}).$$

Indeed, for any n-form $f$, the pairing values $\langle f, R^\alpha/n! \rangle$ are precisely the
Bezier ordinates of $f$, the coefficients that are needed to expand $f$ as a linear
combination of the Bernstein n-forms. As a result, it seems natural to refer
to the basis $(R^\alpha/n!)_{|\alpha|=n}$ as the Bezier basis for n-sites that is associated
with the barycentric reference frame $(R_0, \ldots, R_d)$.
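
The claim that the Bezier ordinates are blossom values can be illustrated in
the univariate case d = 1. The sketch below is an assumption-laden example,
not the text's own construction: it computes the blossom by the standard
elementary-symmetric formula, takes a hypothetical reference interval [R, S],
and then checks that the resulting ordinates really are the coefficients of
the Bernstein expansion. The names `blossom`, `bezier_ordinates`, and
`bernstein_sum` are ours.

```python
from fractions import Fraction
from itertools import combinations
from math import comb, prod


def blossom(coeffs, args):
    """Blossom of f(t) = sum(coeffs[i] * t**i), viewed as having degree len(args)."""
    n = len(args)
    total = Fraction(0)
    for i, c in enumerate(coeffs):
        e_i = sum(prod(subset) for subset in combinations(args, i))
        total += Fraction(c) * e_i / comb(n, i)
    return total


def bezier_ordinates(coeffs, n, R, S):
    """Blossom values at (R, ..., R, S, ..., S) with n - i copies of R and i of S."""
    return [blossom(coeffs, (R,) * (n - i) + (S,) * i) for i in range(n + 1)]


def bernstein_sum(ordinates, R, S, t):
    """Linear combination of the Bernstein n-forms, evaluated at the point t."""
    n = len(ordinates) - 1
    r0 = Fraction(S - t, S - R)        # barycentric coordinate of t relative to R
    r1 = Fraction(t - R, S - R)        # barycentric coordinate of t relative to S
    return sum(comb(n, i) * r0 ** (n - i) * r1 ** i * b
               for i, b in enumerate(ordinates))


cubic = [1, -2, 0, 3]                  # f(t) = 1 - 2t + 3t^3, an arbitrary example
ordinates = bezier_ordinates(cubic, 3, 1, 3)
t = Fraction(5, 2)
direct = sum(c * t ** i for i, c in enumerate(cubic))
via_bernstein = bernstein_sum(ordinates, 1, 3, t)
```

The two end ordinates are the point evaluations f(R) and f(S), matching the
remark about corner sites.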



8.2 The de Casteljau Algorithm 

The de Casteljau Algorithm can be thought of in various ways. From a
blossoming perspective, it starts with the Bezier ordinates of $f$, the blossom
values

$$\langle f, R^\alpha/n! \rangle = \tilde f(\underbrace{R_0, \ldots, R_0}_{\alpha_0}, \ldots, \underbrace{R_d, \ldots, R_d}_{\alpha_d})$$

for $|\alpha| = n$ and, by taking repeated linear combinations, it computes an
arbitrary blossom value $\tilde f(p_1, \ldots, p_n)$, where $p_1$ through $p_n$ are any anchors
over $A$. Now that we understand about n-sites, we can avoid all mention of
the n-form $f$ as follows: The de Casteljau Algorithm computes an arbitrary
real-lineal n-site $p_1 \cdots p_n/n!$ as a linear combination of the Bezier n-sites
$(R^\alpha/n!)_{|\alpha|=n}$.

Here's how the de Casteljau Algorithm works. Let $e_i$, for $i$ in [0 . . d],
denote the multi-index that has a one in the i-th place and zeros everywhere
else, so that $\alpha = \alpha \cdot e = \alpha_0 e_0 + \cdots + \alpha_d e_d$. The de Casteljau Algorithm
computes an n-site over $A$ that we shall denote $I_\alpha$, for all $|\alpha| \le n$:

    for $|\alpha| = n$ do $I_\alpha := R^\alpha/n!$ od;
    for $k$ from 1 to $n$ do
        for $|\alpha| = n - k$ do
            $I_\alpha := r_0(p_k)\, I_{\alpha+e_0} + \cdots + r_d(p_k)\, I_{\alpha+e_d}$
        od;
    od;
    output $I_{(0,\ldots,0)} = p_1 \cdots p_n/n!$

This works because, for all $\alpha$ with $|\alpha| = n - k$, we have $I_\alpha = p_1 \cdots p_k R^\alpha/n!$.
The first statement establishes this invariant for k = 0. To analyze the
assignment in the inner loop inductively, consider the barycentric expansion
$p = r_0(p) R_0 + \cdots + r_d(p) R_d$, which is valid for all anchors $p$ over $A$. Setting
$p := p_k$ in that expansion and then multiplying by $p_1 \cdots p_{k-1} R^\alpha/n!$ shows
that $I_\alpha$ is set correctly.
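
The algorithm above can be sketched in Python as follows. This is a minimal
illustration from the blossoming perspective: instead of manipulating n-sites
symbolically, it runs the same recurrence on the Bezier ordinates of one
particular form, so that the output is the blossom value at the given anchors.
The representation (a dictionary keyed by multi-indices, anchors given by their
barycentric coordinates, n at least 1) and the names `multi_indices` and
`de_casteljau` are assumptions of this sketch.

```python
from fractions import Fraction


def multi_indices(total, parts):
    """All tuples of `parts` nonnegative integers that sum to `total`."""
    if parts == 1:
        return [(total,)]
    return [(first,) + rest
            for first in range(total + 1)
            for rest in multi_indices(total - first, parts - 1)]


def de_casteljau(ordinates, n, anchors):
    """Blossom value at n anchors, each given by its barycentric coordinates.

    `ordinates` maps each multi-index alpha with |alpha| = n to the
    Bezier ordinate <f, R^alpha/n!>.
    """
    if len(anchors) != n:
        raise ValueError("need exactly n anchors")
    table = dict(ordinates)
    for k, r in enumerate(anchors, start=1):
        # Stage k: combine the d + 1 neighbors alpha + e_i with weights r_i(p_k).
        table = {
            alpha: sum(r[i] * table[alpha[:i] + (alpha[i] + 1,) + alpha[i + 1:]]
                       for i in range(len(r)))
            for alpha in multi_indices(n - k, len(r))
        }
    return table[(0,) * len(anchors[0])]


# f(x, y) = x*y as a quadratic on the triangle R0 = (0,0), R1 = (1,0), R2 = (0,1).
half = Fraction(1, 2)
ordinates = {alpha: Fraction(0) for alpha in multi_indices(2, 3)}
ordinates[(0, 1, 1)] = half                    # blossom value at (R1, R2)

P = (Fraction(1, 4), Fraction(1, 4), half)     # the point (x, y) = (1/4, 1/2)
pi = (-1, 1, 0)                                # the vector R1 - R0

value = de_casteljau(ordinates, 2, [P, P])     # f(P)
mixed = de_casteljau(ordinates, 2, [P, pi])    # blossom value at (P, pi)
```

The second call uses an anchor of weight 0, whose barycentric coordinates sum
to zero; by Formula 7.7-2, twice that blossom value is the derivative of f at
P in the direction pi.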
8.3 Degree raising

Given the Bezier points of a polynomial curve or surface of degree at most
n, there are well-known rules for computing the Bezier points of that same
curve or surface, viewed as being of degree at most n + 1. The process is
called degree raising (a.k.a. degree elevation). As an exercise in the use of
the paired algebras, let's rederive those well-known rules.

Each coordinate of the original curve or surface is a real-valued function
on the parameter space $A$, and homogenization converts each of them into
an n-form on $A$. Let $f$ be one of those n-forms. If we homogenized up to
degree n + 1, rather than to degree n, the result would be $wf$, where $w$ is
the weight coanchor on $A$. This suggests that degree raising corresponds to
multiplication by $w$. Indeed, since any point $P$ in $A$ has $w(P) = 1$, we have
$(wf)(P) = w(P) f(P) = f(P)$; so the n-form $f$ and the (n+1)-form $wf$ agree
at all points. Note that they don't agree at arbitrary anchors, however.

So our task is to compute the Bezier ordinates of $wf$ in terms of those of $f$.
Letting $[R_0, \ldots, R_d]$ be a reference d-simplex in $A$, the Bezier ordinates of $f$
are the values $\langle f, R^\alpha/n! \rangle$, where $\alpha$ varies over all multi-indices with $|\alpha| = n$.
Similarly, the Bezier ordinates of $wf$ are the values $\langle wf, R^\beta/(n+1)! \rangle$, for
$|\beta| = n + 1$. We can compute the latter in terms of the former by using
the less popular contraction operator $\rfloor$, the one that contracts a site on a
form to produce a site of lower degree.

From the definition of contraction, we have

$$\langle wf, R^\beta/(n+1)! \rangle = \langle f, w \rfloor R^\beta/(n+1)! \rangle.$$

Contracting on a coanchor obeys the product rule, just like contracting on
an anchor, which is differentiation. Since $w(R_0) = \cdots = w(R_d) = 1$, we have
$w \rfloor R^\beta = \beta_0 R^{\beta - e_0} + \cdots + \beta_d R^{\beta - e_d}$. Thus, for any $\beta$ with $|\beta| = n + 1$, we get
the familiar formula for degree raising

(8.3-1)
$$
\begin{aligned}
\langle wf, R^\beta/(n+1)! \rangle &= \langle f, w \rfloor R^\beta/(n+1)! \rangle \\
&= \langle f, (\beta_0 R^{\beta - e_0} + \cdots + \beta_d R^{\beta - e_d})/(n+1)! \rangle \\
&= \frac{\beta_0}{n+1} \langle f, R^{\beta - e_0}/n! \rangle + \cdots + \frac{\beta_d}{n+1} \langle f, R^{\beta - e_d}/n! \rangle.
\end{aligned}
$$

If $\beta_i = 0$ for some $i$, then the exponent $\beta - e_i$ will have a negative entry,
which might seem like trouble; while the paired-algebras framework does
allow us to multiply by points, it doesn't allow us to divide by them. (Though
it could; see Section 8.5.) But any such troublesome term is multiplied by
$\beta_i = 0$, and hence drops out of the sum. This is just like differentiating the
constant 1 via the power-law, where we have $D_\pi 1 = D_\pi g^0 = 0\, g^{-1} D_\pi g = 0$
for any $g$, whether or not $g$ is invertible.

Formula 8.3-1 expresses the $\beta$-th Bezier ordinate of $wf$ as an affine
combination of d + 1 of the Bezier ordinates of $f$. This process actually has nothing
to do with the particular n-form $f$. Here is that same formula, rewritten to
express the w-contraction of the Bezier (n+1)-site $R^\beta/(n+1)!$ as an affine
combination of d + 1 Bezier n-sites:

$$w \rfloor \frac{R^\beta}{(n+1)!} = \frac{\beta_0}{n+1} \left( \frac{R^{\beta - e_0}}{n!} \right) + \cdots + \frac{\beta_d}{n+1} \left( \frac{R^{\beta - e_d}}{n!} \right).$$
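
Formula 8.3-1 translates directly into code. The sketch below is an
illustration under assumptions of our own: Bezier ordinates are stored in a
dictionary keyed by multi-indices, and a form is evaluated at an anchor by
summing the Bernstein n-forms at that anchor's barycentric coordinates. The
names `raise_degree` and `evaluate` are hypothetical.

```python
from fractions import Fraction
from math import factorial, prod


def multi_indices(total, parts):
    """All tuples of `parts` nonnegative integers that sum to `total`."""
    if parts == 1:
        return [(total,)]
    return [(first,) + rest
            for first in range(total + 1)
            for rest in multi_indices(total - first, parts - 1)]


def raise_degree(ordinates, n):
    """Bezier ordinates of w*f (degree n + 1) from those of f (degree n)."""
    parts = len(next(iter(ordinates)))
    raised = {}
    for beta in multi_indices(n + 1, parts):
        total = Fraction(0)
        for i in range(parts):
            if beta[i] > 0:            # terms with beta_i = 0 drop out of the sum
                lower = beta[:i] + (beta[i] - 1,) + beta[i + 1:]
                total += Fraction(beta[i], n + 1) * ordinates[lower]
        raised[beta] = total
    return raised


def evaluate(ordinates, n, r):
    """Value of the n-form with these ordinates at barycentric coordinates r."""
    return sum(Fraction(factorial(n), prod(factorial(a) for a in alpha))
               * prod(x ** a for x, a in zip(r, alpha)) * b
               for alpha, b in ordinates.items())


# f(t) = t^2 on the interval [0, 1]: ordinates 0, 0, 1.
quadratic = {(2, 0): Fraction(0), (1, 1): Fraction(0), (0, 2): Fraction(1)}
cubic = raise_degree(quadratic, 2)

point = (Fraction(1, 4), Fraction(3, 4))     # weight 1
heavy = (Fraction(1), Fraction(1))           # an anchor of weight 2
same_at_point = evaluate(cubic, 3, point) == evaluate(quadratic, 2, point)
scaled_at_anchor = evaluate(cubic, 3, heavy) == 2 * evaluate(quadratic, 2, heavy)
```

The last comparison illustrates the remark that f and wf agree at points but
not at arbitrary anchors: at an anchor of weight 2, the raised form is twice
the original.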

8.4 The geometry of point evaluations 

Representing dual functionals on n-forms as n-sites is helpful also when
studying those dual functionals that evaluate at points. Let $\varepsilon_P$ be the dual
functional that evaluates n-forms at the point $P$, so that $\varepsilon_P(f) := f(P)$.
In traditional approaches to CAGD, this functional is just one element of a
space of linear functionals, rather abstract and disembodied. But we have
forged the connection $f(P) = \langle f, P^n/n! \rangle$; so the functional $\varepsilon_P$ is represented,
in the paired-algebras framework, by the n-site $\varepsilon_P = P^n/n!$. Why is such a
concrete, algebraic formula possible and what does it buy us?

Such a formula is possible because the Veronese map $\theta_{d,n}$, the map that
takes each point $P$ in the d-space $A$ to its n-th power $\theta_{d,n}(P) := P^n$, turns
out to encapsulate precisely the nonlinear stuff that has to happen as part
of evaluating an n-form. Evaluating an n-form $f$ at a point $P$ is not a linear
process; while the value $f(P)$ is a linear function of $f$, it is a nonlinear
function of $P$. But (and here is the key insight on which the
paired-algebras framework is built) the value $f(P)$ varies linearly as a function
of $P^n$. We can evaluate any n-form $f$ at the point $P$ by plugging $f$ and $P^n$
into the bilinear map $(f, s) \mapsto \langle f, s/n! \rangle$. Thus, raising $P$ to the n-th power
precomputes precisely enough information about $P$ so that the subsequent
evaluation of any n-form at $P$ is a linear process.

Math remark: Raising $P$ to the nth power is linearly equivalent to evaluating
n-forms at $P$ only over fields of characteristic zero. Over a field of prime
characteristic, raising $P$ to the nth power may not give us enough information
to evaluate an arbitrary n-form at $P$. The problem shows up already for
functions of a single variable, say $f(t) = f(C + t\varphi)$. Raising $t$ to
the nth power means computing the coefficients in the binomial expansion

$$t^n = (C + t\varphi)^n = \sum_{0 \le i \le n} \binom{n}{i}\, t^i\, C^{n-i} \varphi^i .$$

In characteristic zero, knowing the values
$\bigl(\binom{n}{i} t^i\bigr)_{0 \le i \le n}$ is linearly equivalent to knowing
the values $(t^i)_{0 \le i \le n}$, the latter being what we need to evaluate
any n-form $f$ at $t$. But in prime characteristic, we may have
$\binom{n}{i} = 0$ for some $i$ in the range $0 < i < n$, so we may not know
enough. The fact that the evaluate-at-$P$ functional $e_P$ is represented by
the n-site $P^n/n!$ serves as a warning that we would be in trouble were $n!$
to be 0.
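
As a small numerical illustration of this remark, here is a minimal sketch; the helper name `vanishing_binomials` is hypothetical and not part of the text. It lists the indices $i$ with $0 < i < n$ for which the binomial coefficient $\binom{n}{i}$ is zero modulo a prime $p$, that is, the powers $t^i$ that the nth power would fail to record in characteristic $p$.

```python
from math import comb

def vanishing_binomials(n, p):
    """Indices i with 0 < i < n for which C(n, i) is 0 modulo the prime p."""
    return [i for i in range(1, n) if comb(n, i) % p == 0]

# Characteristic 2, n = 4: every interior coefficient of (C + t*phi)^4 vanishes,
# so the 4th power would record only t^0 and t^4.
print(vanishing_binomials(4, 2))

# Characteristic 3, n = 5: no interior coefficient vanishes.
print(vanishing_binomials(5, 3))
```

An empty list means that, for this particular $n$ and $p$, no interior binomial coefficient vanishes.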

Once we have the formula $e_P = P^n/n!$, we can use that correspondence
as a source of insight into the geometry of the point-evaluation functionals.
Indeed, given any points $P_1$ through $P_m$ in $A$, the geometric relationships
that hold among the functionals $e_{P_i}$ are precisely those that hold among
the corresponding n-sites $P_i^n/n!$. And since most geometric relationships
aren't affected by a uniform scaling, we can often drop the annoying $n!$ and
consider simply the nth powers $P_1^n$ through $P_m^n$.

For example, let $d := \dim(A)$ and let $m := \binom{n+d}{n}$ denote the
dimension of the space $\mathrm{Sym}_n(\hat{A}^*)$ of n-forms on $A$. An n-form
on $A$ then involves $m$ degrees of freedom. So we might hope that we could
specify an n-form $f$ on $A$ by requiring that $f$ interpolate arbitrary
specified values at $m$ fixed points, say $(P_1, \ldots, P_m)$. Whether that
scheme works or not depends upon the geometric structure of the points
$(P_i)$. The points $(P_1, \ldots, P_m)$ are called good for interpolation by
d-variate n-ics when specifying arbitrary real values for $f(P_1)$ through
$f(P_m)$ determines a unique n-form $f$. (Since the n-form $f$ has been
homogenized, it is actually a polynomial in $d + 1$ variables; but it is still
referred to as d-variate.) In the univariate case $d = 1$, the points
$(P_1, \ldots, P_{n+1})$ are good for interpolation by n-ics whenever they are
distinct, as follows from the Vandermonde determinant. But the multivariate
case is more subtle. Note, for example, that we certainly can't allow more
than $n + 1$ of our $m = \binom{n+d}{n}$ points $(P_i)$ to be collinear, since
the n-form restricted to that line is a univariate n-ic.

From basic linear algebra, the points $(P_1, \ldots, P_m)$ will be good for
interpolation by n-ics just when the point-evaluation functionals
$(e_{P_1}, \ldots, e_{P_m})$ are linearly independent, hence forming a basis for
the dual space $\mathrm{Sym}_n(\hat{A}^*)^*$. Now that we have the
correspondence $e_P = P^n/n!$, we can make this criterion more primal and
concrete: The points $(P_1, \ldots, P_m)$ are good for interpolation by n-ics
just when the n-sites $(P_1^n, \ldots, P_m^n)$ are linearly independent, hence
forming a basis for the space $\mathrm{Sym}_n(\hat{A})$ of n-sites over $A$.
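
The linear-independence criterion can be tested numerically. The following is a minimal sketch under stated assumptions: points are given by Cartesian coordinates and homogenized by prepending a 1, and the nth power $P^n$ is represented by the row of all degree-n monomials in those coordinates (in characteristic zero this differs from the true coefficient vector only by nonzero multinomial scalings of the columns, which do not change the rank). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import prod

def monomial_row(point, n):
    """All degree-n monomials in the homogenized coordinates (1, x_1, ..., x_d)."""
    coords = (Fraction(1),) + tuple(Fraction(c) for c in point)
    return [prod(combo) for combo in combinations_with_replacement(coords, n)]

def rank(rows):
    """Rank of a matrix, by exact Gaussian elimination over the rationals."""
    rows = [list(r) for r in rows]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found

def good_for_interpolation(points, n):
    """True when the n-th powers of the points form a basis of the n-sites."""
    rows = [monomial_row(p, n) for p in points]
    return len(rows) == len(rows[0]) == rank(rows)

half = Fraction(1, 2)
divided_triangle = [(0, 0), (1, 0), (0, 1), (half, 0), (0, half), (half, half)]
four_collinear = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1)]
print(good_for_interpolation(divided_triangle, 2))
print(good_for_interpolation(four_collinear, 2))
```

The second configuration puts four of its six points on one line, more than the $n + 1 = 3$ that a quadratic allows, so the test rejects it.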

8.4.1 Evenly n-divided d-simplices 

One standard example of a configuration of points that is good for
interpolation by d-variate n-ics is the vertices of an evenly n-divided
d-simplex. In this section, we introduce that configuration of points. For
completeness, we also verify that those points are indeed good for
interpolation by n-ics. Figure 8.1 shows an evenly 4-divided 2-simplex.

Figure 8.1: An evenly 4-divided 2-simplex

Let $(R_0, \ldots, R_d)$ be a barycentric reference frame for a d-dimensional
affine space $A$. The points $R_0$ through $R_d$ are then the vertices of a
d-simplex $[R_0, \ldots, R_d]$ in $A$. We want to subdivide that simplex evenly
into subsimplices whose linear dimensions are $n$ times smaller. To do that, we
let $\beta \cdot R$ denote the linear combination
$\beta \cdot R := \beta_0 R_0 + \cdots + \beta_d R_d$, where
$\beta = (\beta_0, \ldots, \beta_d)$ is a multi-index. Since $R_0$ through $R_d$
are all points, the dot product $\beta \cdot R$ is an anchor over $A$ of weight
$|\beta|$. Consider the set of points $(\beta \cdot R/n)_{|\beta| = n}$. We shall
say that those $m := \binom{n+d}{n}$ points evenly n-divide the d-simplex
$[R_0, \ldots, R_d]$. Note that the edge from $R_i$ to $R_j$ is divided into $n$
segments of equal length by the points

$$R_i,\quad \frac{(n-1)R_i + R_j}{n},\quad \frac{(n-2)R_i + 2R_j}{n},\quad \ldots,\quad R_j .$$
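
To make the configuration concrete, here is a minimal sketch that enumerates the multi-indices $\beta$ with $|\beta| = n$ and the corresponding points $\beta \cdot R/n$. The function names and the representation of vertices as coordinate tuples are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def multi_indices(d, n):
    """Yield every (d+1)-tuple of nonnegative integers that sums to n."""
    if d == 0:
        yield (n,)
        return
    for first in range(n, -1, -1):
        for rest in multi_indices(d - 1, n - first):
            yield (first,) + rest

def evenly_divide(vertices, n):
    """Map each multi-index beta with |beta| = n to the point beta . R / n."""
    d = len(vertices) - 1
    dim = len(vertices[0])
    return {
        beta: tuple(
            sum(Fraction(b, n) * v[c] for b, v in zip(beta, vertices))
            for c in range(dim)
        )
        for beta in multi_indices(d, n)
    }

triangle = [(0, 0), (1, 0), (0, 1)]
points = evenly_divide(triangle, 4)
print(len(points), comb(4 + 2, 4))   # both count the points of Figure 8.1
```

Dividing by $n$ is what forces $n$ to be positive here; the anchors $\beta \cdot R$ themselves need no such restriction.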

By the way, we must restrict $n$ to be positive when evenly n-dividing
a d-simplex, since we are dividing by $n$ to get points. The theory would
perhaps be cleaner if we dealt directly with the weight-n anchors
$(\beta \cdot R)_{|\beta| = n}$. We might refer to those anchors as evenly
n-replicating the d-simplex. An evenly 0-replicated d-simplex would consist of
the single anchor 0, for any dimension $d$ (vacant remark: including $d = -1$).

Proposition 8.4-1 Let $(R_0, \ldots, R_d)$ be a barycentric reference frame for
the d-dimensional affine space $A$. For any positive $n$, the points
$(\beta \cdot R/n)_{|\beta| = n}$ that result from evenly n-dividing the
d-simplex $[R_0, \ldots, R_d]$ are good for interpolation by n-ics on $A$.

Proof We must show that the evaluation functionals
$(e_{\beta \cdot R/n})_{|\beta| = n}$ form a basis for the dual space
$\mathrm{Sym}_n(\hat{A}^*)^*$ or, equivalently, letting $s_\beta$ denote the
n-site

$$s_\beta := \frac{1}{n!} \left( \frac{\beta \cdot R}{n} \right)^{n},$$

that the n-sites $(s_\beta)_{|\beta| = n}$ form a basis for the space
$\mathrm{Sym}_n(\hat{A})$. We'll use the latter language for practice, even
though this proof doesn't exploit the multiplication in the algebra of sites.

How can we prove that the particular family of n-sites
$(s_\beta)_{|\beta| = n}$ forms a basis? Let $(f_\alpha)_{|\alpha| = n}$ be some
family of n-forms on $A$. Letting $m := \binom{n+d}{n}$, we can construct an
m-by-m matrix whose $(\alpha, \beta)$ entry is $\langle f_\alpha, s_\beta \rangle$.
If that matrix is invertible, it follows that both the n-forms $(f_\alpha)$ and
the n-sites $(s_\beta)$ must constitute bases. So it suffices to construct some
family of n-forms $(f_\alpha)$ which, with the n-sites $(s_\beta)$, generate an
invertible matrix.

We effect that construction in a nonsymmetric way, letting the vertex
$R_0$ play a special role. For $i$ from 1 to $d$, let
$\varphi_i := (R_i - R_0)/n$ be the vector that separates two adjacent
subdivision points along the edge joining $R_0$ to $R_i$. The sequence
$(R_0, \varphi_1, \ldots, \varphi_d)$ constitutes a Cartesian reference frame
for the space $A$. In that reference frame, the point $\beta \cdot R/n$ can be
rewritten as $R_0 + (\beta_+ \cdot \varphi)$, where
$\beta_+ := (\beta_1, \ldots, \beta_d)$ denotes the multi-index $\beta$ with its
zeroth component $\beta_0$ removed. Let $(w, u_1, \ldots, u_d)$ be the basis for
the space $\hat{A}^*$ of coanchors that is dual to the basis
$(R_0, \varphi_1, \ldots, \varphi_d)$ for $\hat{A}$. From the duality
constraints, we deduce that
$\langle w, R_0 + (\beta_+ \cdot \varphi) \rangle = 1$, while
$\langle u_i, R_0 + (\beta_+ \cdot \varphi) \rangle = \beta_i$. It follows that,
for any integer $k$, we have
$\langle u_i - kw, R_0 + (\beta_+ \cdot \varphi) \rangle = (u_i - kw)(R_0 + (\beta_+ \cdot \varphi)) = \beta_i - k$.
This value is zero, of course, precisely when $k = \beta_i$.

To generate lots of such zeros, we use falling-factorial powers. For any $i$
in $[1 \,..\, d]$ and any nonnegative $k$, we define the k-form
$u_i^{\underline{k}}$ by

$$u_i^{\underline{k}} := u_i (u_i - w)(u_i - 2w) \cdots (u_i - (k-1)w).$$

These are analogs of the falling-factorial powers in combinatorics [24], but
homogenized. Note that $u_i^{\underline{k}}(R_0 + (\beta_+ \cdot \varphi))$ will
be zero just when $k > \beta_i$, because of the factor of $u_i - \beta_i w$ in
$u_i^{\underline{k}}$.

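For any multi-index $\alpha$, we denote by $u^{\underline{\alpha_+}}$ the
product
$u^{\underline{\alpha_+}} := u_1^{\underline{\alpha_1}} \cdots u_d^{\underline{\alpha_d}}$.
We then choose our family of n-forms $(f_\alpha)_{|\alpha| = n}$ to be

$$f_\alpha := w^{\alpha_0}\, u^{\underline{\alpha_+}} = w^{\alpha_0}\, u_1^{\underline{\alpha_1}} \cdots u_d^{\underline{\alpha_d}} .$$

Pairing any n-form $f$ with the n-site $s_\beta$ corresponds to evaluating $f$
at the subdivision point $R_0 + (\beta_+ \cdot \varphi)$. So, if
$\alpha_i > \beta_i$ for any $i$ from 1 to $d$, we conclude that
$\langle f_\alpha, s_\beta \rangle = f_\alpha(R_0 + (\beta_+ \cdot \varphi)) = 0$.
On the other hand, the diagonal entry $\langle f_\alpha, s_\alpha \rangle$
definitely won't be zero.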

We now order both the rows and columns of our m-by-m matrix using
any common total ordering $\prec$ of the multi-indices with the property that
$\alpha_0 > \beta_0$ implies $\alpha \prec \beta$, that is, larger values of
$\alpha_0$ get listed first. It follows that, whenever $\alpha \succeq \beta$,
we have $\alpha_0 \le \beta_0$. Since $|\alpha| = |\beta| = n$, either
$\alpha = \beta$ or else we have $\alpha_i > \beta_i$, for some $i$ in
$[1 \,..\, d]$. For example, Figure 8.2 shows the matrix that arises for
bivariate cubics, under a certain ordering, using the abbreviations
$(C, \varphi, \psi) := (R_0, \varphi_1, \varphi_2)$ and
$(w, u, v) := (w, u_1, u_2)$. The matrix that results from any such ordering
will be upper-triangular with nonzero entries on the diagonal, and will hence
be invertible. □

Figure 8.2: Proving Proposition 8.4-1 for bivariate cubics



Vacant remark: The proof of Proposition 8.4-1 used a Cartesian reference
frame, so it required $d \ge 0$. But the result holds also for $d = -1$,
trivially since $n$ is positive. If we were working with anchors instead of
points, the analogous result would hold as well for evenly 0-replicated
(-1)-simplices, and the proof would be only slightly less trivial.
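
The triangular matrix used in the proof of Proposition 8.4-1 can be tabulated directly, as a minimal sketch. It assumes the coordinates of that proof, in which the form $f_\alpha$ takes the value $\prod_{i \ge 1} \beta_i^{\underline{\alpha_i}}$ at the subdivision point $R_0 + (\beta_+ \cdot \varphi)$ (there $w = 1$ and $u_i = \beta_i$). The function names are hypothetical, and the ordering chosen is just one ordering that lists larger values of $\alpha_0$ first.

```python
from math import prod

def multi_indices(d, n):
    """Yield the multi-indices of weight n, larger zeroth components first."""
    if d == 0:
        yield (n,)
        return
    for first in range(n, -1, -1):
        for rest in multi_indices(d - 1, n - first):
            yield (first,) + rest

def falling(x, k):
    """Falling-factorial power x (x-1) ... (x-k+1); the empty product is 1."""
    return prod(x - j for j in range(k))

def pairing_matrix(d, n):
    """Entry (alpha, beta) is the value of f_alpha at the point beta . R / n."""
    order = list(multi_indices(d, n))
    return [
        [prod(falling(b, a) for a, b in zip(alpha[1:], beta[1:])) for beta in order]
        for alpha in order
    ]

# Bivariate cubics: a 10-by-10 matrix, zero below the diagonal.
for row in pairing_matrix(2, 3):
    print(row)
```

Each diagonal entry is a product of factorials $\beta_1! \cdots \beta_d!$, which is why it cannot vanish.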

8.4.2 The Differencing Algorithm 

Suppose that we have specified an n-form $f$ on $A$ by choosing the values of
$f$ at the points that result from evenly n-dividing the d-simplex
$[R_0, \ldots, R_d]$, that is, by choosing the values
$(f(\alpha \cdot R/n))_{|\alpha| = n}$. Proposition 8.4-1 tells us that $f$ is
uniquely determined by those values, so we can compute anything that we like
about $f$. There turns out to be a surprisingly easy way to compute nth
derivatives of $f$. That is, the computation itself is easy; we do $n$ stages of
differencing, quite similar to the $n$ stages of linear combinations in the
de Casteljau Algorithm. But the reason why this Differencing Algorithm works
is more subtle. In this section, as another example of the benefits of the
paired-algebras framework, we verify this Differencing Algorithm by exploiting
the multiplication in the algebra of sites.

Rephrasing this without mentioning the n-form $f$, Proposition 8.4-1 says
that the n-sites $((\alpha \cdot R/n)^n/n!)_{|\alpha| = n}$ form a basis for the
linear space $\mathrm{Sym}_n(\hat{A})$ of n-sites. So we can expand any n-site
as a linear combination of those basis elements. The Differencing Algorithm is
a particularly simple way to expand certain n-sites: the products
$\pi_1 \cdots \pi_n$ of $n$ vectors over $A$.

The Differencing Algorithm is quite similar to the de Casteljau Algorithm
in computational structure; surprisingly similar, given that the two
algorithms take quite different inputs. The de Casteljau Algorithm starts with
the sites in a barycentric monomial basis, while the Differencing Algorithm
starts with the nth powers of the points in an evenly n-divided d-simplex:

    for $|\alpha| = n$ do $J_\alpha := (\alpha \cdot R/n)^n/n!$ od;
    for $k$ from 1 to $n$ do
        for $|\alpha| = n - k$ do
            $J_\alpha := n \bigl( r_0(\pi_k)\, J_{\alpha + e_0} + \cdots + r_d(\pi_k)\, J_{\alpha + e_d} \bigr)$
        od;
    od;
    Output $J_{(0, \ldots, 0)} = \pi_1 \cdots \pi_n$

Recall that $e_i$, for $i$ in $[0 \,..\, d]$, denotes the multi-index that has a
one in the ith place and zeros elsewhere. The factor of $n$ in the inner loop is
justified as follows. The numbers $(r_0(\pi_k), \ldots, r_d(\pi_k))$ are the
barycentric coordinates of the vector $\pi_k$ in the reference frame
$(R_0, \ldots, R_d)$. But we want to take differences with respect to one of the
small simplices into which that large simplex is divided. Viewed with respect
to one of those small simplices, the vector $\pi_k$ looks $n$ times longer.
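
Because pairing with a fixed n-form $f$ is linear, the Differencing Algorithm can be run on numbers instead of sites: start from the values $f(\alpha \cdot R/n) = \langle f, (\alpha \cdot R/n)^n/n! \rangle$ and finish with the number $\langle f, \pi_1 \cdots \pi_n \rangle$. The following is a minimal sketch of that numeric version; the function names `differencing` and `sample` and the dictionary representation are assumptions made for illustration, not the author's code.

```python
from fractions import Fraction

def multi_indices(d, n):
    """Yield every (d+1)-tuple of nonnegative integers that sums to n."""
    if d == 0:
        yield (n,)
        return
    for first in range(n, -1, -1):
        for rest in multi_indices(d - 1, n - first):
            yield (first,) + rest

def differencing(values, vectors):
    """Run the n stages of differencing on the numbers <f, J_alpha>.

    values  maps each multi-index alpha with |alpha| = n to f(alpha . R / n).
    vectors lists, for k = 1..n, the barycentric coordinates
            (r_0(pi_k), ..., r_d(pi_k)) of the vector pi_k; each must sum to 0.
    """
    n = len(vectors)
    d = len(next(iter(values))) - 1
    if any(sum(r) != 0 for r in vectors):
        raise ValueError("each pi_k must be a vector (coordinates summing to 0)")
    J = dict(values)
    for k in range(1, n + 1):
        r = vectors[k - 1]
        J = {
            alpha: n * sum(
                r[i] * J[alpha[:i] + (alpha[i] + 1,) + alpha[i + 1:]]
                for i in range(d + 1)
            )
            for alpha in multi_indices(d, n - k)
        }
    return J[(0,) * (d + 1)]

def sample(f, vertices, n):
    """Tabulate f at the points alpha . R / n of the evenly n-divided simplex."""
    d, dim = len(vertices) - 1, len(vertices[0])
    return {
        alpha: f(*(sum(Fraction(a, n) * v[c] for a, v in zip(alpha, vertices))
                   for c in range(dim)))
        for alpha in multi_indices(d, n)
    }

# Univariate: f(t) = t^2 on [0 .. 1], differenced twice along the unit vector.
quadratic = sample(lambda t: t * t, [(0,), (1,)], 2)
print(differencing(quadratic, [(-1, 1), (-1, 1)]))

# Bivariate: f(x, y) = x*y on the unit triangle, along the x and y directions.
bilinear = sample(lambda x, y: x * y, [(0, 0), (1, 0), (0, 1)], 2)
print(differencing(bilinear, [(-1, 1, 0), (-1, 0, 1)]))
```

In both examples the output agrees with the corresponding second derivative of $f$, which is what the text promises; the guard on the coordinate sums reflects the requirement that each $\pi_k$ be a vector.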

We verified the de Casteljau Algorithm quite easily, but several things
indicate that the Differencing Algorithm is more subtle. For one thing, the
Differencing Algorithm requires $\pi_1$ through $\pi_n$ to be vectors, that is,
to have barycentric coordinates that sum to zero. If we try to use the
Differencing Algorithm to compute an n-site $p_1 \cdots p_n$ whose factors are
not vectors, it gets the wrong answer. For another thing, the factor of $n!$
that divides the input sites has mysteriously disappeared in the output site;
the result is $\pi_1 \cdots \pi_n$, with no $n!$ in the denominator.

If we substitute the definitions of the sites $(J_\alpha)$ computed earlier
into the formulas for those computed later, we can capture the correctness of
the Differencing Algorithm as the following hoped-for algebraic identity:

$$\sum_{\substack{0 \le i_1 \le d \\ \cdots \\ 0 \le i_n \le d}} r_{i_1}(\pi_1)\, r_{i_2}(\pi_2) \cdots r_{i_n}(\pi_n)\, \frac{(R_{i_1} + \cdots + R_{i_n})^n}{n!} = \pi_1 \cdots \pi_n .$$

Note that $(e_{i_1} + \cdots + e_{i_n}) \cdot R = R_{i_1} + \cdots + R_{i_n}$. Note
also that the factors of $n$ in the inner-loop differences, nested $n$ levels
deep, cancel against the $n^n$ in the denominators of the input sites.

We shall generalize that identity slightly and then prove it by induction
on $n$. The generalization reflects the fact that nth differences of
polynomials of degree less than $n$ are zero:

$$\sum_{\substack{0 \le i_1 \le d \\ \cdots \\ 0 \le i_n \le d}} r_{i_1}(\pi_1) \cdots r_{i_n}(\pi_n)\, \frac{(R_{i_1} + \cdots + R_{i_n})^m}{n!} = \begin{cases} \pi_1 \cdots \pi_n & \text{when } m = n \\ 0 & \text{when } 0 \le m < n. \end{cases} \tag{8.4-2}$$

The base case of the induction is the trivial case $n = m = 0$. The sum on
the left has $(d + 1)^n = (d + 1)^0 = 1$ term, that single term being
$0^m/n! = 1$. The empty product on the right is also 1.
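
Equation 8.4-2 can be sanity-checked numerically before it is proved. The sketch below rests on one assumption that should be stated loudly: since the identity is a polynomial identity in the symbols $R_0$ through $R_d$, substituting arbitrary numbers `x[i]` for the $R_i$ must preserve it, with each vector $\pi_k$ becoming the number $\sum_i r_i(\pi_k)\, x_i$. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import factorial, prod

def lhs_8_4_2(r, x, m):
    """Left side of 8.4-2; r[k][i] stands for r_i(pi_{k+1}), x[i] for R_i."""
    n, d = len(r), len(x) - 1
    total = Fraction(0)
    for idx in product(range(d + 1), repeat=n):
        coeff = prod(r[k][i] for k, i in enumerate(idx))
        total += coeff * Fraction(sum(x[i] for i in idx)) ** m
    return total / factorial(n)

def rhs_8_4_2(r, x, m):
    """Right side of 8.4-2: the product pi_1 ... pi_n when m = n, else 0."""
    n = len(r)
    if not 0 <= m <= n:
        raise ValueError("Equation 8.4-2 covers only 0 <= m <= n")
    if m < n:
        return 0
    return prod(sum(ri * xi for ri, xi in zip(row, x)) for row in r)

# Three vectors over a plane (d = 2); every coordinate row sums to zero.
r = [(-1, 1, 0), (1, -3, 2), (0, -1, 1)]
x = (2, 5, 11)
for m in range(4):
    print(m, lhs_8_4_2(r, x, m), rhs_8_4_2(r, x, m))
```

Such a check is no substitute for the induction that follows, but it does catch transcription errors in the indices and in the factor of $n!$.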

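So assume that Equation 8.4-2 holds up to $n$ and consider what happens
for some $m$ with $0 \le m \le n + 1$. We start off by pulling the sum over
$i_{n+1}$ outside and then applying the Binomial Theorem:

$$\begin{aligned}
&\sum_{\substack{0 \le i_1 \le d \\ \cdots \\ 0 \le i_{n+1} \le d}} r_{i_1}(\pi_1) \cdots r_{i_{n+1}}(\pi_{n+1})\, \frac{(R_{i_1} + \cdots + R_{i_{n+1}})^m}{(n+1)!} \\
&\quad = \sum_{0 \le i_{n+1} \le d} r_{i_{n+1}}(\pi_{n+1}) \sum_{\substack{0 \le i_1 \le d \\ \cdots \\ 0 \le i_n \le d}} r_{i_1}(\pi_1) \cdots r_{i_n}(\pi_n)\, \frac{\bigl((R_{i_1} + \cdots + R_{i_n}) + R_{i_{n+1}}\bigr)^m}{(n+1)!} \\
&\quad = \sum_{0 \le i_{n+1} \le d} r_{i_{n+1}}(\pi_{n+1}) \sum_{0 \le k \le m} \binom{m}{k} R_{i_{n+1}}^{m-k} \sum_{\substack{0 \le i_1 \le d \\ \cdots \\ 0 \le i_n \le d}} r_{i_1}(\pi_1) \cdots r_{i_n}(\pi_n)\, \frac{(R_{i_1} + \cdots + R_{i_n})^k}{(n+1)!} .
\end{aligned}$$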

The innermost of these three nested sums is zero by induction when $k < n$;
it doesn't matter that the denominator is $(n + 1)!$ instead of $n!$. So we can
raise the lower bound in the sum on $k$ from 0 to $n$. We can also lower the
upper bound from $m$ to $m - 1$, as follows. When $k = m$, the factor
$R_{i_{n+1}}^{m-k}$ drops out, leaving nothing that depends upon $i_{n+1}$. So
the terms with $k = m$ contribute some constant multiple of the outer sum
$r_0(\pi_{n+1}) + \cdots + r_d(\pi_{n+1})$. But that sum is zero, since
$\pi_{n+1}$ is a vector.

Thus, we can tighten the bounds in the sum on $k$ from $0 \le k \le m$ to
$n \le k < m$. So nothing at all remains when $m < n + 1$, as we had hoped.
When $m = n + 1$, the single value $k = n$ gives us

$$\sum_{0 \le i_{n+1} \le d} r_{i_{n+1}}(\pi_{n+1})\, (n+1)\, R_{i_{n+1}} \sum_{\substack{0 \le i_1 \le d \\ \cdots \\ 0 \le i_n \le d}} r_{i_1}(\pi_1) \cdots r_{i_n}(\pi_n)\, \frac{(R_{i_1} + \cdots + R_{i_n})^n}{(n+1)!} .$$

The $n + 1$ that came from the $\binom{m}{k}$ converts the $(n + 1)!$ in the
denominator to an $n!$, after which the inductive hypothesis replaces the
inner sum with $\pi_1 \cdots \pi_n$, leaving us, again as we had hoped, with

$$\Bigl( \sum_{0 \le i_{n+1} \le d} r_{i_{n+1}}(\pi_{n+1})\, R_{i_{n+1}} \Bigr)\, \pi_1 \cdots \pi_n = \pi_1 \cdots \pi_{n+1} .$$

8.5 Integrating over a simplex 

We have talked a lot about multiplying by points; what about dividing by 
them? Multiplying by points makes sense because we have extended the 
affine space A of points into the algebra Sym(A) of sites. If we wanted to, 
we could further extend that algebra into a field: the field Quo(Sym(A)) 
of quotients whose numerators and denominators are sites. We would then 
need yet another new term; for example, we might refer to a quotient of 
sites over A as a location over A. I don't yet see many applications for 
locations in CAGD, so I don't yet recommend that we in CAGD take this 
additional step, from sites to locations. But locations do have at least one 
intriguing application; this section discusses a formula for integrating real- 
valued functions over simplices that can be expressed more simply using 
locations than using sites. Perhaps, when enough other applications have 
been discovered, it will be time to take this further (and final?) step toward 
allowing arithmetic on points: from points to anchors to sites to locations. 

Some words about the mathematics of locations: Let
$(C, \varphi_1, \ldots, \varphi_d)$ be, say, a Cartesian reference frame for the
d-space $A$, and let $(w, u_1, \ldots, u_d)$ be the basis for the linear space
$\hat{A}^*$ of coanchors that is dual to that frame. So the algebras
$\mathrm{Sym}(\hat{A})$ and $\mathrm{Sym}(\hat{A}^*)$ of sites and forms are
isomorphic to the polynomial algebras
$\mathbb{R}[C, \varphi_1, \ldots, \varphi_d]$ and
$\mathbb{R}[w, u_1, \ldots, u_d]$. Both of those algebras are free of zero
divisors; that is, for any two sites $s$ and $t$ over $A$, we have $st = 0$ only
when either $s = 0$ or $t = 0$, and similarly for forms on $A$. Thus, each of
those algebras, when viewed as a ring, is an integral domain (a.k.a. is
entire). So each of those algebras has a quotient field. The quotient field
$\mathrm{Quo}(\mathrm{Sym}(\hat{A}^*))$ of the algebra of forms is isomorphic to
the field of rational functions $\mathbb{R}(w, u_1, \ldots, u_d)$ in the $d + 1$
variables $w$ and $u_1$ through $u_d$; we allow ourselves to divide by any form
that is not identically zero. Alternatively, we can exploit duality to think
of $\mathrm{Quo}(\mathrm{Sym}(\hat{A}^*))$ as the field
$\mathrm{Rat}(\hat{A}, \mathbb{R})$ of real-valued, rational functions on
anchors. Locations over $A$ are elements of the quotient field
$\mathrm{Quo}(\mathrm{Sym}(\hat{A}))$ of the algebra of sites, where we allow
ourselves to divide by any site that is not identically zero. The field
$\mathrm{Quo}(\mathrm{Sym}(\hat{A}))$ of locations over $A$ is isomorphic to the
field of rational functions $\mathbb{R}(C, \varphi_1, \ldots, \varphi_d)$; or we
can exploit duality to think of it as the field
$\mathrm{Rat}(\hat{A}^*, \mathbb{R})$ of real-valued, rational functions on
coanchors.

Now, about that integration formula: Let $[R_0, \ldots, R_d]$ be a reference
d-simplex for the affine space $A$. Recall that the Bezier ordinates of an
n-form $f$ on $A$ are the real numbers $\langle f, R^\alpha/n! \rangle$ for
$|\alpha| = n$, the $\binom{n+d}{n}$ scalars that result from pairing $f$ with
each of the Bezier n-sites. Given some notion of volume in the space $A$, say a
measure $\mu$ on $A$, suppose that we want to integrate the n-form
$f \colon A \to \mathbb{R}$ with respect to $\mu$ over the reference simplex
$[R_0, \ldots, R_d]$. Lasserre and Avrachenkov [37] give a pretty formula for
this integral, based on the observation that all of the Bezier ordinates
contribute to the integral with equal weight. Thus, the integral is the volume
of the reference simplex times the average of the Bezier ordinates:

$$\int_{[R_0, \ldots, R_d]} f(P)\, d\mu(P) = \frac{\mu([R_0, \ldots, R_d])}{\binom{n+d}{n}} \sum_{|\alpha| = n} \langle f, R^\alpha/n! \rangle . \tag{8.5-1}$$

We won't bother to prove Formula 8.5-1 here, our goal being instead to
use locations to simplify the sum on the right-hand side. One proof starts
by showing that the average of the Bezier ordinates is not affected when the
degree of $f$ is raised. After raising the degree of $f$ quite high, the many
Bezier ordinates that result closely approximate the values of $f$, so their
average becomes essentially a Riemann sum for the integral. Farin [20]
sketches that proof in the univariate case, and it works equally well in
higher dimensions.
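
Formula 8.5-1 is easy to try out in the univariate case. The following minimal sketch assumes the reference segment is $[0 \,..\, 1]$ with ordinary length as the measure, so that the volume factor is 1, and it assumes the polynomial is given by integer monomial coefficients. The conversion used is the standard monomial-to-Bernstein formula; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def bezier_ordinates(coeffs):
    """Bezier ordinates on [0 .. 1] of sum_k coeffs[k] * t**k, degree n = len - 1.

    Uses the standard conversion b_j = sum_{k <= j} C(j, k) / C(n, k) * a_k.
    """
    n = len(coeffs) - 1
    return [
        sum(Fraction(comb(j, k), comb(n, k)) * coeffs[k] for k in range(j + 1))
        for j in range(n + 1)
    ]

def integral_via_ordinates(coeffs):
    """Length of [0 .. 1] (which is 1) times the average Bezier ordinate."""
    ordinates = bezier_ordinates(coeffs)
    return sum(ordinates) / len(ordinates)

def integral_exact(coeffs):
    """Termwise antiderivative, evaluated from 0 to 1."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(coeffs))

cubic = [1, -2, 0, 4]            # 1 - 2t + 4t^3
print(bezier_ordinates(cubic))
print(integral_via_ordinates(cubic), integral_exact(cubic))
```

Here the number of ordinates is $n + 1 = \binom{n+1}{n}$, matching the denominator of Formula 8.5-1 for $d = 1$.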

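Both sides of Formula 8.5-1 are linear functions of the n-form $f$, so each
must correspond to pairing $f$ with some n-site. Recalling that
$f(P) = \langle f, P^n/n! \rangle$, we have

$$\int_{[R_0, \ldots, R_d]} f(P)\, d\mu(P) = \Bigl\langle f,\ \frac{1}{n!} \int_{[R_0, \ldots, R_d]} P^n\, d\mu(P) \Bigr\rangle .$$

So Formula 8.5-1 boils down to this relationship among the n-sites over $A$:

$$\int_{[R_0, \ldots, R_d]} P^n\, d\mu(P) = \frac{\mu([R_0, \ldots, R_d])}{\binom{n+d}{n}} \sum_{|\alpha| = n} R^\alpha . \tag{8.5-2}$$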

Consider the special case $d = 1$, where we integrate over a line segment
$[R_0 \,..\, R_1]$; and let's write that segment as $[R \,..\, S]$ for
simplicity. We get

$$\int_{[R \,..\, S]} P^n\, d\mu(P) = \frac{\mu([R \,..\, S])}{n + 1} \bigl( R^n + R^{n-1} S + \cdots + S^n \bigr) .$$

If this were elementary calculus, we wouldn't distinguish between a point
on the domain line and a real number. We could then express the length
$\mu([R \,..\, S])$ simply as $(S - R)$, which would help the sum on the right
to collapse, leading to the elementary formula

$$\int_R^S P^n\, dP = \frac{S^{n+1} - R^{n+1}}{n + 1} .$$

In the paired-algebras framework, however, that equation is nonsense, since
it alleges that an n-site equals an (n + 1)-site. To avoid such nonsense, we
must distinguish between the real number $\mu([R \,..\, S])$ and the vector
$S - R$.

But let's suppose that we have defined locations over $A$, thereby making
it legal to divide by nonzero sites, as well as to multiply by them. We can
then achieve much the same collapsing in a legitimate manner as follows,
multiplying and dividing by the nonzero vector $S - R$:

$$\begin{aligned}
\int_{[R \,..\, S]} P^n\, d\mu(P) &= \frac{\mu([R \,..\, S])}{n + 1} \cdot \frac{(S - R)(R^n + R^{n-1} S + \cdots + S^n)}{S - R} \\
&= \frac{\mu([R \,..\, S])}{n + 1} \cdot \frac{S^{n+1} - R^{n+1}}{S - R} .
\end{aligned}$$

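What happens when the dimension $d$ exceeds 1? Can we still use division
by sites to achieve analogous collapsings? The following identity for bivariate
cubics with reference triangle $\triangle QRS$ points the way:

$$\begin{aligned}
&Q^3 + Q^2 R + Q^2 S + Q R^2 + Q R S + Q S^2 + R^3 + R^2 S + R S^2 + S^3 \\
&\quad = \frac{Q^{3+2}}{(Q - R)(Q - S)} + \frac{R^{3+2}}{(R - Q)(R - S)} + \frac{S^{3+2}}{(S - Q)(S - R)} .
\end{aligned}$$

We can make the univariate case fit that pattern by a little rewriting:

$$\int_{[R \,..\, S]} P^n\, d\mu(P) = \frac{\mu([R \,..\, S])}{n + 1} \left( \frac{R^{n+1}}{R - S} + \frac{S^{n+1}}{S - R} \right) .$$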

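More generally, for any dimension $d$ and degree $n$, we shall prove that

$$\sum_{|\alpha| = n} R^\alpha = \sum_{0 \le k \le d} \frac{R_k^{n+d}}{\prod_{\substack{0 \le j \le d \\ j \ne k}} (R_k - R_j)} . \tag{8.5-3}$$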

Equation 8.5-3 is an algebraic identity; as we shall prove in a moment, it
holds whenever the symbols $R_0$ through $R_d$ denote distinct elements of some
field. By substituting the right-hand side for the left in Equation 8.5-2, we
get a new integration formula, using locations, that is arguably simpler:

$$\int_{[R_0, \ldots, R_d]} P^n\, d\mu(P) = \frac{\mu([R_0, \ldots, R_d])}{\binom{n+d}{n}} \sum_{0 \le k \le d} \frac{R_k^{n+d}}{\prod_{\substack{0 \le j \le d \\ j \ne k}} (R_k - R_j)} . \tag{8.5-4}$$

The good news about this new Formula 8.5-4 is that the sum on the right-hand
side has collapsed, leaving a sum with only $d + 1$ terms, instead of
$\binom{n+d}{n}$. Each term is more complicated; rather than summing n-sites, we
are summing locations over $A$ whose numerators are (n + d)-sites and whose
denominators are d-sites. The bad news about Equation 8.5-4 is that I don't
know how to pair an n-form with such a location. Thus, it may well be that
Formula 8.5-4 is useless for actually computing integrals, serving only as an
intriguing, location-based way to abbreviate Formula 8.5-2.

It remains to prove Equation 8.5-3, which we shall do by a joint induction
on $n$ and $d$. When $d = 0$, Equation 8.5-3 reduces to the trivial identity
$R_0^n = R_0^n$. When $n = 0$, the left-hand sum has an empty product as its
single term, so the left-hand side reduces to 1. To handle the right-hand side
in the case $n = 0$, suppose that we use the Lagrange Interpolation Formula
to interpolate the univariate polynomial $t \mapsto t^d$ at the $d + 1$ distinct
real points $t := r_0$ through $t := r_d$. The Lagrange interpolant will
reconstruct the polynomial $t^d$ exactly, so we find that

$$t^d = \sum_{0 \le k \le d} r_k^d \prod_{\substack{0 \le j \le d \\ j \ne k}} \frac{t - r_j}{r_k - r_j} .$$

Extracting the coefficients of $t^d$ from each term in this polynomial
identity, we conclude that

$$1 = \sum_{0 \le k \le d} \frac{r_k^d}{\prod_{\substack{0 \le j \le d \\ j \ne k}} (r_k - r_j)} \tag{8.5-5}$$

for all sequences $(r_0, \ldots, r_d)$ of distinct real numbers. If we
multiplied this Equation 8.5-5 through by the least common denominator
$\prod_{0 \le i < j \le d} (r_j - r_i)$ of the terms in the sum, however, we
would be left with a polynomial identity. So Equation 8.5-5 must also hold with
$(r_0, \ldots, r_d)$ replaced by any $d + 1$ distinct elements of any field; in
particular, by the points $(R_0, \ldots, R_d)$ in the affine space $A$. This
establishes Equation 8.5-3 in the case $n = 0$.

Suppose now that both $n$ and $d$ are positive, and let's rewrite the
left-hand side of Equation 8.5-3 in the equivalent form

$$S(d, n) := \sum_{|\alpha| = n} R^\alpha = \sum_{0 \le i_1 \le \cdots \le i_n \le d} R_{i_1} \cdots R_{i_n} .$$

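Partitioning this sum according as $i_n = d$ or not, we have

$$S(d, n) = R_d\, S(d, n - 1) + S(d - 1, n) .$$

Applying the inductive hypothesis in each case, we find that

$$S(d, n) = R_d \sum_{0 \le k \le d} \frac{R_k^{n-1+d}}{\prod_{\substack{0 \le j \le d \\ j \ne k}} (R_k - R_j)} + \sum_{0 \le k \le d-1} \frac{R_k^{n+d-1}}{\prod_{\substack{0 \le j \le d-1 \\ j \ne k}} (R_k - R_j)} .$$

We multiply both the numerator and denominator of the right-hand summand
by $R_k - R_d$, to get

$$S(d, n) = R_d \sum_{0 \le k \le d} \frac{R_k^{n-1+d}}{\prod_{\substack{0 \le j \le d \\ j \ne k}} (R_k - R_j)} + \sum_{0 \le k \le d-1} \frac{R_k^{n+d-1} (R_k - R_d)}{\prod_{\substack{0 \le j \le d \\ j \ne k}} (R_k - R_j)} .$$

We can now raise the upper limit on $k$ in the right-hand sum from $d - 1$ to
$d$, since $R_k - R_d = 0$ when $k = d$. We then expand the right-hand sum into
two sums, the second of which cancels the left-hand sum, and we are left with

$$S(d, n) = \sum_{0 \le k \le d} \frac{R_k^{n+d}}{\prod_{\substack{0 \le j \le d \\ j \ne k}} (R_k - R_j)} ,$$

which completes the proof by induction.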
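
Since Equation 8.5-3 holds for distinct elements of any field, it can be checked with rational numbers standing in for the points. The following minimal sketch computes $S(d, n)$ both by the recurrence $S(d, n) = R_d\, S(d, n-1) + S(d-1, n)$ and by the closed form on the right-hand side; the function names and the tuple `x` of stand-in numbers are assumptions made for illustration.

```python
from fractions import Fraction
from math import prod

def S(x, n):
    """Sum of all degree-n monomials in x[0], ..., x[d], via the recurrence
    S(d, n) = x_d * S(d, n-1) + S(d-1, n)."""
    if n == 0:
        return Fraction(1)
    if len(x) == 1:
        return Fraction(x[0]) ** n
    return x[-1] * S(x, n - 1) + S(x[:-1], n)

def S_closed(x, n):
    """Right-hand side of Equation 8.5-3; the entries of x must be distinct."""
    d = len(x) - 1
    return sum(
        Fraction(x[k]) ** (n + d)
        / prod(x[k] - x[j] for j in range(d + 1) if j != k)
        for k in range(d + 1)
    )

x = (2, 3, 5)                     # stand-ins for Q, R, S in the bivariate cubic case
print(S(x, 3), S_closed(x, 3))    # the two sides of the identity
print(S_closed(x, 0))             # the case n = 0, that is, Equation 8.5-5
```

The closed form needs only $d + 1$ terms, while the recurrence implicitly visits all $\binom{n+d}{n}$ monomials, which is the collapse described in the text.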



Exercise 8.5-6 Prove Equation 8.5-3 without using induction by combining
the ideas in our proof of the case $n = 0$ with Leibniz's Formula [13] from the
theory of divided differences.

Hint: Find the univariate polynomial of degree at most $d$ that interpolates
the polynomial $t \mapsto t^{n+d}$ at the points $t := r_0$ through $t := r_d$.
Using the Lagrange Interpolation Formula as in our proof of the base case
$n = 0$, show that the right-hand side of Equation 8.5-3 gives the coefficient
of $t^d$ in that interpolant. But that same coefficient is also the divided
difference $[r_0, \ldots, r_d]\, t^{n+d}$. Show that the left-hand side of
Equation 8.5-3 gives that divided difference by using Leibniz's Formula
repeatedly, expanding $t^{n+d}$ as the product of $n + d$ copies of $t$.
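
The claim in the hint can be explored numerically before attempting the proof. This minimal sketch computes the divided difference $[r_0, \ldots, r_d]\, t^{n+d}$ by the usual recursive table and compares it with a brute-force sum of all degree-n monomials in the $r_k$; the function names are hypothetical, and the check is an experiment, not a solution to the exercise.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import prod

def divided_difference(nodes, values):
    """Divided difference [r_0, ..., r_d] of values tabulated at distinct nodes."""
    table = [Fraction(v) for v in values]
    for level in range(1, len(nodes)):
        table = [
            (table[i + 1] - table[i]) / (nodes[i + level] - nodes[i])
            for i in range(len(table) - 1)
        ]
    return table[0]

def monomial_sum(nodes, n):
    """Sum of all degree-n monomials in the nodes (left side of 8.5-3)."""
    return sum(prod(combo) for combo in combinations_with_replacement(nodes, n))

nodes, n = (2, 3, 5), 3
d = len(nodes) - 1
dd = divided_difference(nodes, [r ** (n + d) for r in nodes])
print(dd, monomial_sum(nodes, n))
```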



Chapter 9 

Universal Mapping Conditions 



People who use the homogenized framework already have some familiarity
with linearization, the process that extends an affine space $A$ into the
linear space $\hat{A}$. The paired-algebras framework also relies on
algebrization, the process that extends a linear space $X$ into the symmetric
algebra $\mathrm{Sym}(X)$. Both extensions can be achieved by various concrete
constructions, three of which we discussed in Section 4.9 for linearization
and in Section 5.1 for algebrization. In those sections, we claimed that it
didn't matter which concrete construction we employed, since we can
characterize our goal abstractly using a universal mapping condition. In this
section, we explore universal mapping conditions at various levels of
abstraction.

9.1 Linearization via a universal condition 

Let's start with linearization, since it is both simpler and more familiar.
Given an affine space $A$, can we characterize its linearization $\hat{A}$
abstractly and uniquely? The "uniquely" part turns out to be hopeless; the
closest that we can come to "unique" while remaining abstract is "unique up to
a unique isomorphism". Since we can't achieve absolute uniqueness, let's
temporarily refrain from talking about "the linearization" and from writing
$\hat{A}$. Instead, let's try to abstractly characterize what it means for some
linear space $X$ to be "a linearization" of the affine space $A$.

Since linearization is a process of extension, we expect a linearization $X$
of $A$ to include $A$ as a subset; indeed, we naively want $A$ to sit, inside
$X$, as an affine hyperplane not containing the origin. Using that subset
language, here is the universal mapping condition that a linearization must
satisfy.

Condition 9.1-1 For any affine space $A$, a linear space $X \supseteq A$ is a
linearization of $A$ when every affine map $j \colon A \to Y$, from $A$ to any
linear space $Y$, extends uniquely to a linear map $f \colon X \to Y$. That is,
there must exist a unique linear map $f \colon X \to Y$ that agrees with $j$ on
the subset $A$ of $X$.






While we naively expect that any linearization $X$ of the affine space $A$
will include $A$ as a subset, it would be technically unfortunate to require
the relationship $A \subseteq X$. For example, we exploited duality in
Section 4.9 to argue that the linear space $\mathrm{Aff}(A, \mathbb{R})^*$ is a
linearization of $A$. The space $\mathrm{Aff}(A, \mathbb{R})^*$ does not include
$A$ as a subset; but it does contain the evaluate-at-$P$ functional $e_P$, for
every point $P$ in $A$, which is almost as good. To allow for these sorts of
linearizations, let's stop requiring $A \subseteq X$ and instead settle for an
affine map $i \colon A \to X$ that lets us view $A$ as sitting inside of $X$.

Condition 9.1-2 For any affine space $A$, a linear space $X$ and an affine map
$i \colon A \to X$, taken together, are a linearization of $A$ when, for every
linear space $Y$ and every affine map $j \colon A \to Y$, there exists a unique
linear map $f \colon X \to Y$ with $f \circ i = j$.

Condition 9.1-2 abstractly captures all of the concrete properties that we 
want a linearization to have. Let's first convince ourselves of that intuitively 
by considering those properties in turn. 

1. We want the affine map $i \colon A \to X$ to be injective. (Indeed, we at
   first built in that requirement by demanding that $A \subseteq X$.) If
   there were distinct points $P$ and $Q$ in $A$ with $i(P) = i(Q)$, we could
   construct an affine map $j \colon A \to Y$ with $j(P) \ne j(Q)$, and then no
   map $f \colon X \to Y$ could possibly exist, linear or not, with
   $f \circ i = j$.

2. We don't want the image space $i(A)$, which will be an affine subspace
   of $X$, to include the origin of $X$. If there were any point $P$ in $A$
   with $i(P) = 0$, we could construct an affine map $j \colon A \to Y$ with
   $j(P) \ne 0$, and then, since linear maps must take zero to zero, no linear
   map $f \colon X \to Y$ could possibly exist with $f \circ i = j$.

3. It follows from the two previous points that the dimension of the linear
   space $X$ must exceed that of $A$. We want $\dim(X)$ to be precisely
   $\dim(A) + 1$, not larger. If $\dim(X)$ were larger, we could take any space
   $Y$ of positive dimension and consider the zero map $j \colon A \to Y$. The
   condition $f \circ i = j = 0$ would require $f$ to be zero on a subspace of
   $X$ of dimension $\dim(A) + 1$, but we would be free to map the remaining
   dimensions of $X$ arbitrarily; so $f$ would not be unique.

But much more is true. Any structure that is defined by a universal
mapping condition of the same flavor as Condition 9.1-2 is always uniquely
determined, up to a unique isomorphism. The following proposition shows
that in the particular case of Condition 9.1-2. But the structure of the
argument is quite general, so the analogous result holds, equally well, for any
structure that is defined by a universal mapping condition in this way.

Proposition 9.1-3 Let $A$ be an affine space, and, for $k$ equal to 1 and 2,
let $X_k$ be a linear space and $i_k \colon A \to X_k$ be an affine map. If both
of the pairs $(X_1, i_1)$ and $(X_2, i_2)$ are linearizations of $A$ according
to Condition 9.1-2, then there exists a unique isomorphism between $X_1$ and
$X_2$ that makes the following diagram commute:

[Diagram: a triangle formed by the affine maps $i_1 \colon A \to X_1$ and
$i_2 \colon A \to X_2$ and the linear map $h_{21} \colon X_1 \to X_2$.]

Warning: The map $h_{21}$ in this diagram goes from $X_1$ to $X_2$, rather than
the reverse; that is, the subscripts should be read from right to left. That
looks weird in the diagram, but works well in a formula such as
$x_2 = h_{21}(x_1)$, where $x_1$ belongs to $X_1$ and $x_2$ to $X_2$. More
generally, with the subscripts in this order, the composition
$h_{ij} \circ h_{kl}$ makes sense just when the adjacent subscripts coincide,
when $j = k$. Section C.1 discusses the sad choices of convention that now
force either our diagrams or our formulas to look wrong.

Proof We first apply the universal mapping condition for $(X_1, i_1)$ to the
pair $(Y, j) := (X_2, i_2)$. We deduce that there exists a unique linear map
$h_{21} \colon X_1 \to X_2$ with $h_{21} \circ i_1 = i_2$. Symmetrically, there
exists a unique linear map $h_{12} \colon X_2 \to X_1$ with
$h_{12} \circ i_2 = i_1$.

It remains to verify that $h_{21}$ is an isomorphism with $h_{12}$ as its
inverse, that is, that the compositions $h_{12} \circ h_{21}$ and
$h_{21} \circ h_{12}$ are the identity maps on $X_1$ and $X_2$, respectively. To
prove the first of those claims, we apply the universal condition for
$(X_1, i_1)$ to the pair $(Y, j) := (X_1, i_1)$. So there exists a unique linear
map $h_{11} \colon X_1 \to X_1$ with $h_{11} \circ i_1 = i_1$. The identity map
on $X_1$ is clearly one candidate for $h_{11}$; but the composition
$h_{12} \circ h_{21}$ is another such candidate, since we have
$h_{12} \circ h_{21} \circ i_1 = h_{12} \circ i_2 = i_1$. Since the map $h_{11}$
is unique, the composition $h_{12} \circ h_{21}$ must be the identity on $X_1$.
The composition $h_{21} \circ h_{12}$ must be the identity on $X_2$
symmetrically. □

Thus, if we define linearization using a universal mapping condition, it 
follows from purely formal reasoning that linearizations are unique, up to a 
unique isomorphism — if they exist at all. This formal reasoning leaves open 
the possibility that linearizations might not exist, however. To show that 
they do exist, we need a more concrete argument, one that appeals to the 
nature of affine and linear spaces. 

Proposition 9.1-4 If $A$ is any affine space, there exists a linear space $X$
and an affine map $i \colon A \to X$ that are a linearization of $A$ according
to Condition 9.1-2.

Proof We need to construct a concrete linearization $(X, i)$, by some method.
The simplest method involves fixing a reference frame for $A$, so that's what
we'll do. We don't have to worry about the resulting linearization depending,
in some bad way, on which frame we happen to choose, since all linearizations
of $A$ are uniquely isomorphic.

Let $(R_0, \ldots, R_d)$ be a barycentric reference frame for $A$. We construct
$X$ as the linear space that has the new atoms $(x_0, \ldots, x_d)$ as a basis.
We define the map $i \colon A \to X$ by setting $i(R_k) := x_k$ for $k$ in
$[0 \,..\, d]$, and then extending in the unique way that makes $i$ affine; so
we have $i(t_0 R_0 + \cdots + t_d R_d) := t_0 x_0 + \cdots + t_d x_d$, for all
real numbers $t_0$ through $t_d$ with $t_0 + \cdots + t_d = 1$.

Note that the dimension of this linear space $X$ is $d + 1$, as it should be,
and that the image $i(A)$, sitting inside $X$, is a hyperplane not containing
the origin: to wit, the hyperplane $t_0 + \cdots + t_d = 1$. Those are
indications that the pair $(X, i)$ might be a linearization of $A$. But the
true test comes from the universal mapping condition.

To verify Condition 9.1-2, let Y be any linear space and $j\colon A \to Y$ any
affine map. If some set map $f\colon X \to Y$ is to satisfy $f \circ i = j$, we must have
$f(x_k) = f(i(R_k)) = j(R_k)$, for all k in $[0\,..\,d]$. If we also require that the
map f be linear, then those d + 1 conditions determine a unique f, since the
atoms $(x_0, \ldots, x_d)$ are a basis of X. Finally, with f determined in this way,
the two affine maps $f \circ i$ and j agree on a barycentric reference frame for
A, so they agree on all of A. Thus, the concrete pair (X, i) does satisfy the
universal mapping condition and is hence a linearization of A. □
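
The construction in this proof is concrete enough to try out numerically. The following is only a minimal sketch under assumptions of our own: A is an affine plane (d = 2) with the hypothetical frame R_0 = (0,0), R_1 = (1,0), R_2 = (0,1), elements of X are stored as coefficient triples over the atoms, and the affine map `j` into Y = R^2 is made up for illustration. The names `embed` and `extend` are not from the text.

```python
from fractions import Fraction as F

# Hypothetical barycentric frame (R_0, R_1, R_2) for an affine plane A.
FRAME = [(F(0), F(0)), (F(1), F(0)), (F(0), F(1))]

def embed(point):
    """The affine map i: A -> X, sending a point to its barycentric
    coordinates (t0, t1, t2), read as the vector t0*x0 + t1*x1 + t2*x2."""
    px, py = point
    t1, t2 = px, py              # valid for this particular frame only
    return (1 - t1 - t2, t1, t2)

def j(point):
    """An arbitrary affine map j: A -> Y = R^2, chosen for illustration."""
    px, py = point
    return (2 * px - py + 3, px + 4 * py - 1)

def extend(j):
    """The unique linear map f: X -> Y determined by f(x_k) = j(R_k)."""
    images = [j(r) for r in FRAME]
    def f(vec):
        return tuple(sum(t * img[c] for t, img in zip(vec, images))
                     for c in range(2))
    return f

f = extend(j)
p = (F(5, 2), F(-7, 3))
assert sum(embed(p)) == 1        # i(A) lies in the hyperplane t0 + t1 + t2 = 1
assert f(embed(p)) == j(p)       # f o i = j
```

The d + 1 values j(R_k) are the only data that `extend` consults, which mirrors the uniqueness half of the argument.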

So linearizations do exist; and they are automatically essentially unique, 
since they are defined by a universal mapping condition. It is then convenient 
to pretend that they are absolutely unique, introducing the notation $\hat{A}$ to
denote "the linearization" of A. As we discussed in Section 4.9.5, this abuse of
language is harmless as long as, whenever two different concrete linearizations
of the same affine space A appear together in any argument, we use the
unique isomorphism between them to identify each element of one with the
corresponding element of the other. It is also convenient to pretend that A is
actually a subset of the linearization $\hat{A}$, so that we don't need to write down
or even to name the underlying affine map $i\colon A \to \hat{A}$.

9.2 Algebrization via a universal condition 

The theory of algebrization is very similar to that of linearization, based on 
the following universal mapping condition. 

Condition 9.2-1 Given a linear space X, a commutative algebra G and a
linear map $i\colon X \to G$ are a commutative algebrization of X when, for every
commutative algebra H and every linear map $j\colon X \to H$, there exists a
unique algebra homomorphism $f\colon G \to H$ that satisfies $f \circ i = j$.

Recall that an algebra homomorphism $f\colon G \to H$ is a linear map that is also
a ring homomorphism; so we have $f(x + y) = f(x) + f(y)$, $f(tx) = t f(x)$,
$f(xy) = f(x) f(y)$, and $f(1) = 1$, for all elements x and y of G and all real
numbers t.

Any two pairs $(G_1, i_1)$ and $(G_2, i_2)$ that are both commutative algebrizations
of a common linear space X will be uniquely isomorphic. The formal
statement of that claim and its proof are so similar to Proposition 9.1-3 that 
we content ourselves with drawing the relevant commutative diagram: 




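[Commutative diagram: the maps $i_1\colon X \to G_1$ and $i_2\colon X \to G_2$, with arrows in both directions between $G_1$ and $G_2$.]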



While essential uniqueness comes for free, we have to do some concrete 
work to show that every linear space X does have a commutative algebriza- 
tion. Fortunately, that work is quite easy, since we already know a lot about 
polynomial algebras. 

Proposition 9.2-2 If X is any linear space, there exists an algebra G and a
linear map $i\colon X \to G$ that are a commutative algebrization of X, according
to Condition 9.2-1. Furthermore, that algebra G has a grading in which the
image i(X) coincides with the first graded slice $G_1$.

Proof As in Proposition 9.1-4, any concrete construction that succeeds will
suffice, so we needn't be afraid to choose a basis. Let $(x_0, \ldots, x_d)$ be a basis
for X, where we set $d := \dim(X) - 1$, for consistency with our intended
applications to the linear spaces $\hat{A}$ and $\hat{A}^*$, where $d = \dim(A)$. Let G denote
the algebra $G := \mathbf{R}[v_0, \ldots, v_d]$ of all polynomials in the d + 1 variables $v_0$
through $v_d$. We define a linear map $i\colon X \to G$ by setting $i(x_k) := v_k$ for k in
$[0\,..\,d]$ and then extending by linearity.

Note that the algebra $G = \mathbf{R}[v_0, \ldots, v_d]$ is graded by total degree. Under
this grading, the first graded slice is the linear space $\mathbf{R}_1[v_0, \ldots, v_d]$, consisting
of all linear combinations of the variables. That linear space coincides with
the image space i(X).

It remains to verify Condition 9.2-1. So, let H be any commutative
algebra and let $j\colon X \to H$ be any linear map. If some set map $f\colon G \to H$
is to satisfy $f \circ i = j$, we must have $f(v_k) = f(i(x_k)) = j(x_k)$, for all k
in $[0\,..\,d]$. Those conditions are just enough to determine a unique algebra
homomorphism $f\colon G \to H$. To see this, consider any element y of G, so y is
a polynomial in the variables $(v_0, \ldots, v_d)$. Suppose that, for each k in $[0\,..\,d]$,
we substitute $j(x_k)$ for $v_k$ in the polynomial y. The resulting expression will
simplify to some element z of the algebra H. Any algebra homomorphism f
that takes $v_k$ to $j(x_k)$, for all k in $[0\,..\,d]$, must take y to z. And performing
those substitutions does give an algebra homomorphism $f\colon G \to H$. Finally,
since the linear maps $f \circ i$ and j agree on the basis $(x_0, \ldots, x_d)$ of X, they
must agree on all of X. □
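
The substitution step of this proof can be sketched in a few lines. This is an illustration under our own assumptions, not a general implementation: polynomials are stored as dictionaries from exponent tuples to coefficients, the target algebra H is taken to be the rationals (the simplest commutative algebra), and the values chosen for j(x_k) are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction as F

def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = defaultdict(F)
    for ea, ca in p.items():
        for eb, cb in q.items():
            out[tuple(a + b for a, b in zip(ea, eb))] += ca * cb
    return {e: c for e, c in out.items() if c != 0}

def substitute(p, images):
    """The map f: G -> H that replaces each variable v_k by images[k] = j(x_k)."""
    total = F(0)
    for exps, coeff in p.items():
        term = coeff
        for image, e in zip(images, exps):
            term *= image ** e
        total += term
    return total

# Two elements of G = R[v0, v1, v2], so d = 2.
y1 = {(1, 0, 0): F(1), (0, 1, 1): F(-5)}     # v0 - 5 v1 v2
y2 = {(0, 0, 0): F(2), (0, 0, 2): F(3)}      # 2 + 3 v2^2
one = {(0, 0, 0): F(1)}
images = [F(1, 2), F(3), F(-2)]              # hypothetical j(x0), j(x1), j(x2)

assert substitute(one, images) == 1          # f(1) = 1
assert substitute(poly_mul(y1, y2), images) == \
    substitute(y1, images) * substitute(y2, images)   # f(y1 y2) = f(y1) f(y2)
```

Replacing the rationals by any other commutative algebra H would only change the arithmetic inside `substitute`.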

So every linear space X does have a commutative algebrization, which 
is automatically essentially unique. It is a convenient standard practice to 
pretend that this algebrization is absolutely unique. We follow that practice 
by talking about "the commutative algebrization" of X, which is written 
Sym(X) and called the symmetric algebra of X. We also pretend that X 
actually coincides with the first graded slice of its symmetric algebra Sym(X); 
that is, we identify X with its image $i(X) = \mathrm{Sym}_1(X)$.

In addition to the symmetric algebra Sym(X), which is commutative, 
there are several noncommutative algebrizations of a linear space X, as we 
discuss in the next section. But the symmetric algebra is the simplest, so we 
are lucky that the algebras of forms and sites are symmetric algebras. 

Exercise 9.2-3 Show that the following alternative universal mapping
condition also characterizes the symmetric algebra Sym(X).

Condition 9.2-4 Given a linear space X, a commutative graded algebra
$G = \bigoplus_{n \ge 0} G_n$ and a linear map $i\colon X \to G_1$ are a commutative algebrization
of X when, for every commutative graded algebra $H = \bigoplus_{n \ge 0} H_n$ and every
linear map $j\colon X \to H_1$, there exists a unique graded-algebra homomorphism
$f\colon G \to H$ with $f \circ i = j$.

9.3 Tensors 

Most math texts that construct the symmetric algebra Sym(X) use tensors, 
even though we have just seen that polynomials suffice. Polynomials suffice, 
in fact, even for linear spaces X of infinite dimension or over arbitrary fields. 
So why do math texts use tensors? There are two reasons. 

First, some texts want to deal with scalars that come, not from a field,
but instead from some commutative ring R. The analog of a linear space,
in this more general context, is called an R-module. If M is any R-module,
it is possible to construct a commutative R-algebra Sym(M) that satisfies
the appropriate universal mapping condition. But polynomials don't suffice
to construct Sym(M); you need tensors. Indeed, the first step on the road
to Sym(M) via polynomials would be to choose a basis for M; but only the
nicest modules, the free modules, have bases. One of the reasons that linear
spaces are simpler than modules is that all linear spaces are free.

To understand the other reason, we must broaden our sights to consider
noncommutative algebras. There are at least four different ways to algebrize
a linear space X, each characterized by its own universal mapping condition:

• The symmetric algebra Sym(X) is the free commutative algebra gen- 
erated by X. 

• The tensor algebra $T(X) = \bigotimes X$ is the free algebra generated by X,
not required to be commutative.

• The alternating (a.k.a. skew-symmetric, exterior, or Grassmann) algebra
$\mathrm{Alt}(X) = \bigwedge X$ is the free algebra generated by X in which elements
of X skew-commute.

• Finally, if X has an associated quadratic form $Q\colon X \to \mathbf{R}$, the Clifford
algebra Clif(X) is the free algebra generated by X in which every
element x of X satisfies $x^2 = -Q(x)$.

Some math texts use tensors to build the symmetric algebra because they 
need to develop the machinery of tensors anyway, in order to construct some 
of these other algebras. 

9.3.1 The tensor algebra 

Recall that we characterized the symmetric algebra abstractly using either 
of two universal mapping conditions, Condition 9.2-1 or 9.2-4. The tensor 
algebra can be characterized abstractly using either of those conditions as 
well, just omitting the requirement for commutativity. 

To concretely construct the tensor algebra, people typically use tensor 
products. Recall that the tensor product of two linear spaces X and Y is 
a linear space $X \otimes Y$ of dimension $\dim(X)\dim(Y)$. The tensor product is,
itself, abstractly characterized by a universal mapping property involving 
bilinear maps; but it would take us too far afield to review that. 

Given any linear space X, let $X^{\otimes n}$ denote the tensor product

$$X^{\otimes n} := \underbrace{X \otimes \cdots \otimes X}_{n \text{ factors}}$$

of X with itself n times. We can concretely construct the tensor algebra
T(X) as the direct sum $T(X) := \bigoplus_{n \ge 0} X^{\otimes n}$. The tensor algebra is graded,
but, as soon as dim(X) exceeds 1, is not commutative. In particular, if $x_1$
and $x_2$ are two elements of X that are linearly independent, then $x_1 \otimes x_1$,
$x_1 \otimes x_2$, $x_2 \otimes x_1$, and $x_2 \otimes x_2$ are four linearly independent elements of
$X^{\otimes 2} = X \otimes X$. If $(b_0, \ldots, b_d)$ is a basis for X, then the products
$b_{i_1} \otimes \cdots \otimes b_{i_n}$ form a basis for $X^{\otimes n}$, where the subscripts $i_1$ through $i_n$
vary independently from 0 to d. Thus, an arbitrary element of $X^{\otimes n}$ has an
n-dimensional matrix of coefficients.
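
A small sketch may make the index bookkeeping concrete. It assumes coordinates with respect to a chosen basis, stores a 2-tensor as a dictionary from index pairs to coefficients, and uses a helper name (`tensor`) of our own.

```python
from itertools import product

def tensor(u, v):
    """Coefficient matrix of u (x) v for coordinate vectors u and v."""
    return {(i, j): a * b for i, a in enumerate(u) for j, b in enumerate(v)}

d, n = 2, 3
# Index tuples (i_1, ..., i_n), each entry varying independently from 0 to d.
basis_labels = list(product(range(d + 1), repeat=n))
assert len(basis_labels) == (d + 1) ** n       # dimension of the n-th tensor power

x1, x2 = (1, 0, 0), (0, 1, 0)                  # two linearly independent vectors
assert tensor(x1, x2) != tensor(x2, x1)        # the tensor product does not commute
```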

Let Y := X* be the dual space of X. The formula

$$\langle y_1 \otimes \cdots \otimes y_n,\; x_1 \otimes \cdots \otimes x_n \rangle = \langle y_1, x_1 \rangle \cdots \langle y_n, x_n \rangle$$

determines a natural pairing map between $Y^{\otimes n} = (X^*)^{\otimes n}$ and $X^{\otimes n}$, allowing
each to represent the dual of the other. That formula is analogous to the
Permanent Identity, but without any summing or averaging going on.
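
As a quick check of that formula for n = 2, the sketch below pairs two coefficient matrices entry by entry and compares the result with the product of the individual pairings. It assumes dual bases for X and X*, so that a covector pairs with a vector by a coordinate-wise sum; the sample vectors are arbitrary.

```python
from fractions import Fraction as F

def pair(y, x):
    """Pairing <y, x> of a covector with a vector, in dual bases."""
    return sum(a * b for a, b in zip(y, x))

def tensor(u, v):
    """Coefficient matrix of u (x) v."""
    return {(i, j): a * b for i, a in enumerate(u) for j, b in enumerate(v)}

def pair_tensors(s, t):
    """Pair two 2-tensors entry by entry, with no sum over permutations."""
    return sum(s[k] * t[k] for k in s)

y1, y2 = (F(1), F(2), F(0)), (F(0), F(1), F(-1))
x1, x2 = (F(3), F(1), F(4)), (F(2), F(0), F(5))
assert pair_tensors(tensor(y1, y2), tensor(x1, x2)) == pair(y1, x1) * pair(y2, x2)
```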

The elements of the space $X^{\otimes n}$ are called n-contravariant tensors on X,
while the elements of $Y^{\otimes n} = (X^*)^{\otimes n}$ are n-covariant tensors on X. That
is, "contravariant" here means "primal", while "covariant" means "dual"; 
this usage arose in physics, for reasons that Dodson and Poston explain [19]. 
Sad to say, this usage conflicts with category theory, where "contravariant" 
means "arrow-reversing", while "covariant" means "arrow-preserving". 

If we have already constructed the tensor algebra T(X), there are sev- 
eral ways to construct the symmetric algebra Sym(X) as a by-product. One 
scheme realizes an element of Sym(X) as an equivalence class of tensors, 
under an equivalence relation that makes multiplication commutative. More 
precisely, we construct Sym(X) as the quotient T(X)/I, where I is the smallest
two-sided ideal in T(X) that contains $x_1 \otimes x_2 - x_2 \otimes x_1$, for all $x_1$ and $x_2$
in X. A second scheme looks, inside T(X), at the subset formed by the symmetric
tensors, the ones whose $n$th homogeneous components have coefficient
matrices that are symmetric under all permutations of their n dimensions. 
The second scheme constructs Sym(X) by equipping that set of symmetric 
tensors with a new, symmetrized multiplication. 
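
The second scheme can be sketched for products of two vectors. The sketch assumes one particular normalization, averaging over permutations rather than summing; that choice is ours, made only for illustration, and the helper names are hypothetical.

```python
from fractions import Fraction as F
from itertools import permutations
from math import factorial

def tensor(u, v):
    """Coefficient matrix of the 2-tensor u (x) v."""
    return {(i, j): F(a * b) for i, a in enumerate(u) for j, b in enumerate(v)}

def symmetrize(t, n=2):
    """Average an n-tensor over all permutations of its n index slots."""
    out = {}
    for idx, c in t.items():
        for perm in permutations(range(n)):
            key = tuple(idx[p] for p in perm)
            out[key] = out.get(key, F(0)) + c / factorial(n)
    return out

def sym_product(u, v):
    """Symmetrized product of two vectors, a symmetric 2-tensor."""
    return symmetrize(tensor(u, v))

x1, x2 = (1, 0, 0), (0, 1, 0)
assert tensor(x1, x2) != tensor(x2, x1)              # plain tensors do not commute
assert sym_product(x1, x2) == sym_product(x2, x1)    # symmetrized products do
```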

Because people read about these schemes in textbooks, they sometimes 
end up believing that the symmetric algebra is, in some deep sense, an alge- 
bra of tensors. Indeed, I fell into this trap myself when I claimed that the 
algebra of sites is built using "the symmetric variant of the tensor-product 
construction" [42, 43]. We can build Sym(X) in that way, if we like; but 
tensors are overkill. It is more true to say that the symmetric algebra is an 
algebra of polynomials. (And the real truth, of course, is that the symmetric 
algebra is anything that satisfies the universal mapping condition.) 

Lest any confusion on this point linger, keep in mind that the algebra of 
sites is dual to the algebra of forms. Since we don't need tensors to build the 
algebra of forms, we don't need them to build the algebra of sites either. 

9.3.2 The alternating algebra 

Let $x_1$ and $x_2$ be linearly independent elements of the linear space X. In
the tensor algebra T(X), the products $x_1 \otimes x_2$ and $x_2 \otimes x_1$ are linearly
independent. In the symmetric algebra Sym(X), the products $x_1 x_2$ and
$x_2 x_1$ are equal. In the alternating algebra Alt(X), we arrange that
$x_1 \wedge x_2 = -(x_2 \wedge x_1)$. That is, the multiplication in the algebra Alt(X)
skew-commutes. Of course, we can't possibly have $y \wedge z = -(z \wedge y)$ for all
elements y and z of the algebra Alt(X); for example, multiplication by the
identity commutes in any algebra: We always have $1 \wedge y = y = y \wedge 1$. Rather,
a graded algebra G is called skew-commutative (a.k.a. alternating) when its
multiplication satisfies

(9.3-1) $yz = (-1)^{\deg(y)\deg(z)}\, zy$

for all homogeneous elements y and z in G. So it is homogeneous elements
y and z of odd degree that satisfy $yz = -zy$.

Math remark: Over a field of characteristic 2, we have 1 = —1, so skew- 
commutativity as we have just defined it would reduce to commutativity. 
Instead, skew-commutativity is defined to require both Identity 9.3-1 and 
the identity $y^2 = 0$, for all homogeneous elements y of odd degree. Note
that, when y is homogeneous of odd degree, Identity 9.3-1 gives us $y^2 = -y^2$,
which implies $y^2 = 0$ whenever $2 \ne 0$.

To characterize the alternating algebra abstractly, we can use a universal 
mapping condition similar to Condition 9.2-4; we simply replace the com- 
mutative graded algebras in that condition with skew-commutative graded 
algebras. But an algebra has to be graded before we can require it to be skew- 
commutative. Therefore, the simpler Condition 9.2-1, which doesn't mention 
any grading, cannot be adapted to characterize the alternating algebra. 

As for constructing the alternating algebra concretely, recall that there
are two methods for constructing the symmetric algebra Sym(X) from the
tensor algebra T(X), one by taking a quotient, the other by extracting a
subset. Both of those methods can be adjusted to produce the alternating
algebra Alt(X) instead. The first method constructs Alt(X) as the quotient
T(X)/J, where J is the smallest two-sided ideal in T(X) that contains
$x \otimes x$, for all x in X. The second method looks, inside T(X), at the set
of skew-symmetric tensors, where an n-dimensional matrix of coefficients is
called skew-symmetric (a.k.a. alternating) when it is skew-symmetric in all
$\binom{n}{2}$ pairs of dimensions. We can construct Alt(X) by equipping the set of
skew-symmetric tensors with a new, skew-symmetrized multiplication.

Math remark: Defining skew-symmetry in characteristic 2 has the same wrinkle
that arose in defining skew-commutativity. In order to be skew-symmetric
in characteristic 2, a matrix $(m_{ij})$ must also be zero on the diagonal; that is,
we must have $m_{ii} = 0$, as well as $m_{ij} = -m_{ji}$.

Let's denote the $n$th graded slice of the alternating algebra Alt(X) as
$\mathrm{Alt}_n(X)$; another good notation would be $X^{\wedge n}$, but we want to emphasize
the analogy between Alt(X) and Sym(X). Let's also set $d := \dim(X) - 1$,
since we have in mind applying this theory when X is either $\hat{A}$ or $\hat{A}^*$,
with A an affine d-space. If $(b_0, \ldots, b_d)$ is a basis for X, then the products
$(b_{i_1} \wedge \cdots \wedge b_{i_n})_{0 \le i_1 < \cdots < i_n \le d}$ form a basis for $\mathrm{Alt}_n(X)$. Since equal
subscripts are forbidden, the linear space $\mathrm{Alt}_n(X)$ is smaller than $\mathrm{Sym}_n(X)$;
in particular, we have $\dim(\mathrm{Alt}_n(X)) = \binom{d+1}{n}$ while $\dim(\mathrm{Sym}_n(X)) = \binom{d+n}{n}$.
Indeed, the whole alternating algebra is finite-dimensional; we have
$\dim(\mathrm{Alt}(X)) = 2^{d+1} = 2^{\dim(X)}$.
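
Those dimension counts are easy to confirm by enumerating basis labels. The sketch below does so for d = 3; the function names are ours, and the enumeration relies only on Python's standard `itertools` and `math.comb`.

```python
from itertools import combinations, combinations_with_replacement
from math import comb

def alt_basis(d, n):
    """Index tuples 0 <= i_1 < ... < i_n <= d labeling a basis of Alt_n(X)."""
    return list(combinations(range(d + 1), n))

def sym_basis(d, n):
    """Index tuples 0 <= i_1 <= ... <= i_n <= d labeling a basis of Sym_n(X)."""
    return list(combinations_with_replacement(range(d + 1), n))

d = 3                                    # so dim(X) = 4
for n in range(d + 3):
    assert len(alt_basis(d, n)) == comb(d + 1, n)
    assert len(sym_basis(d, n)) == comb(d + n, n)

# Summing over all grades gives the dimension of the whole alternating algebra.
assert sum(len(alt_basis(d, n)) for n in range(d + 2)) == 2 ** (d + 1)
```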

Let Y := X* denote the dual space of X. In choosing pairing maps
between $\mathrm{Alt}_n(X)$ and $\mathrm{Alt}_n(Y)$, we meet once again that contentious factor
of n!. There is a unique pairing between $\mathrm{Alt}_n(Y) = \mathrm{Alt}_n(X^*)$ and $\mathrm{Alt}_n(X)$
that satisfies the following Summed Determinant Identity:

(9.3-2) $\langle y_1 \wedge \cdots \wedge y_n,\; x_1 \wedge \cdots \wedge x_n \rangle = \sum_{\nu \in S_n} \operatorname{sgn}(\nu) \prod_{1 \le k \le n} \langle y_k, x_{\nu(k)} \rangle$.

The summation index $\nu$ here varies over the symmetric group $S_n$ of all n!
permutations of the integers from 1 to n, while $\operatorname{sgn}(\nu)$, the sign of $\nu$, denotes
1 if $\nu$ is an even permutation and $-1$ if $\nu$ is odd. That sum is precisely the
determinant of the n-by-n matrix whose $(i, j)$th entry is $\langle y_i, x_j \rangle$. There is also
a unique pairing that satisfies the Averaged Determinant Identity, which is
the same, except divided by n!. As for whether summing or averaging is
better in the skew-symmetric context, let's be glad that we have no current
need to decide. The tradeoffs will be different than in the symmetric case.
For example, note that the skew-symmetric $n$th power

$$P^{\wedge n} := \underbrace{P \wedge \cdots \wedge P}_{n \text{ factors}}$$

of a point P is zero as soon as n exceeds 1; so who cares whether it gets
divided by n! or not.
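
The Summed Determinant Identity can be checked directly on small examples. The following sketch assumes dual bases, so that a covector pairs with a vector by a coordinate-wise sum, and it compares the signed sum over permutations with a determinant computed by cofactor expansion. All names and sample vectors are ours.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation, given as a tuple of 0-based images."""
    s = 1
    for a in range(len(perm)):
        for b in range(a + 1, len(perm)):
            if perm[a] > perm[b]:
                s = -s
    return s

def pair(y, x):
    """Pairing <y, x> in dual bases."""
    return sum(a * b for a, b in zip(y, x))

def summed_pairing(ys, xs):
    """Right-hand side of the Summed Determinant Identity."""
    n, total = len(ys), 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for k in range(n):
            term *= pair(ys[k], xs[perm[k]])
        total += term
    return total

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c] * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))

ys = [(1, 0, 2), (0, 1, 1), (3, 0, 1)]
xs = [(1, 1, 0), (0, 2, 1), (1, 0, 1)]
gram = [[pair(y, x) for x in xs] for y in ys]
assert summed_pairing(ys, xs) == det(gram)

# A repeated factor gives zero, as for the skew-symmetric square of a point.
assert summed_pairing(ys[:2], [xs[0], xs[0]]) == 0
```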

When studying the alternating algebra itself, an element of the linear
space $\mathrm{Alt}_n(X)$ is typically called an n-vector over X, while an element of
$\mathrm{Alt}_n(X^*)$ is an n-covector. One important application of the alternating
algebra is to calculus on manifolds, however, and that application has its own 
nomenclature. Recall that a vector field is a map that assigns, to each point 
in a manifold, a vector in the tangent space at that point. If we assign an 
n-covector on that tangent space instead, we get a field of n-covectors; such 
a field is, unfortunately, called a differential n-form. The unfortunate aspect 
of this term is the further overloading of the word "form". A differential 
n-form on a smooth manifold has nothing to do with an n-form on a linear 
space; indeed, the multiplications that underlie those two types of n-forms 
skew-commute and commute, respectively. As for why differential n-forms 
are the proper things to integrate over an n-manifold, the appearance of the
determinant in Equation 9.3-2 provides an indication, since that determinant
expresses the ratio between two measures of volume.

The alternating algebra has other important applications to geometry. In
particular, a lineal n-vector $x_1 \wedge \cdots \wedge x_n$ corresponds to an oriented volume
form on the subspace of X spanned by the vectors $x_1$ through $x_n$. If we
identify two such volume forms that differ by a scalar multiple, we get a
rule that associates, with each n-dimensional subspace of X, a line through
the origin of the space $\mathrm{Alt}_n(X)$. The set of all n-dimensional subspaces of
the linear space X is a Grassmann manifold, and this rule gives us a way
to realize that manifold as a variety in a projective space: essentially the
variety formed by those n-vectors over X that are lineal, that is, that can be
written as the wedge product of n vectors in X. This is why an alternating
algebra is also known as a Grassmann algebra. If we identify volume forms
that differ by positive scalar multiples, but distinguish between those that
differ by negative scalar multiples, we get an oriented version of this theory,
as Stolfi explains [46]. In the oriented theory, the product $P \wedge Q$ of two points
P and Q represents the oriented line from P toward Q.

9.3.3 The Clifford algebra 

Some linear spaces come to us equipped with quadratic forms. For example, 
a Euclidean space has a positive definite inner product, while a Minkowski 
space has an indefinite but non-degenerate metric tensor. When the geometry 
associated with that quadratic form is important to us, it may help to extend 
that linear space into yet another algebra: a Clifford algebra. In particular, 
researchers in CAGD have had good success recently explaining Pythagorean- 
hodograph curves using Clifford algebras [10]. 

Some words about quadratic forms versus bilinear forms. Given any
quadratic form $Q\colon X \to \mathbf{R}$ on a linear space X, there is always a bilinear
form $B\colon X \times X \to \mathbf{R}$ that satisfies the identity $Q(x) = B(x, x)$. And,
since the characteristic of the real numbers $\mathbf{R}$ is not 2, we can determine
the bilinear form B uniquely by requiring that B be symmetric. The inner
product on a Euclidean space and the metric tensor on a Minkowski space
are often thought of as symmetric, bilinear forms. In defining the Clifford
algebra, however, all that we use directly is the diagonal Q of B.

Roughly speaking, the Clifford algebra Clif(X) is the free algebra that
contains X as a subspace and in which we have $x^2 = -Q(x)$, for all elements
x of X. Thus, Clifford-multiplying the vector x times itself results in the
particular scalar $-Q(x)$. Note that the alternating algebra is a special case
of a Clifford algebra, the special case in which Q = 0, so $x \wedge x = -Q(x) = 0$.
Indeed, the Clifford algebra always has the same dimension as the alternating
algebra; but the multiplication in the Clifford algebra has, embedded inside
it in some way, the geometry associated with the form Q.

Clifford algebras don't fit well into the structures of this monograph, so
we won't say much about them. In particular, beware that a Clifford algebra
is not graded, under our definition of that term from Section 2.4. Like
the alternating algebra Alt(X), the Clifford algebra Clif(X) is the direct sum
$\mathrm{Clif}(X) = \bigoplus_{0 \le n \le d+1} \mathrm{Clif}_n(X)$ of subspaces, where $\dim(\mathrm{Clif}_n(X)) = \binom{d+1}{n}$,
for a total dimension of $2^{d+1} = 2^{\dim(X)}$. But the multiplication of the Clifford
algebra does not send $\mathrm{Clif}_i(X) \times \mathrm{Clif}_j(X)$ into $\mathrm{Clif}_{i+j}(X)$. Instead, the Clifford
product of an element of $\mathrm{Clif}_i(X)$ with an element of $\mathrm{Clif}_j(X)$ is a linear
combination of elements of $\mathrm{Clif}_k(X)$, for k in
$\{i + j,\; i + j - 2,\; i + j - 4,\; \ldots,\; |i - j|\}$. The components of the product in
these different "grades" reflect different geometric relationships among the
factors. For example, if x and y are vectors in a Euclidean 3-space X, the
grade-2 part of the Clifford product xy corresponds to the cross product
$x \times y$ (more precisely, it is an oriented volume form on the plane that x and
y span), while the grade-0, scalar part of xy is minus their dot product.
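
The Euclidean 3-space example can be verified with a tiny Clifford multiplier. This is a sketch under stated assumptions, not a general library: we work in an orthonormal basis (e_1, e_2, e_3), encode each basis product as a bitmask, and adopt the sign convention $x^2 = -Q(x)$, so that each e_k squares to -1. The function names are ours.

```python
def blade_product(a, b):
    """Product of two basis blades given as bitmasks, for an orthonormal
    basis in which every e_k squares to -1. Returns (blade, sign)."""
    swaps, shifted = 0, a >> 1
    while shifted:                        # count transpositions needed to sort
        swaps += bin(shifted & b).count("1")
        shifted >>= 1
    squares = bin(a & b).count("1")       # each repeated e_k contributes -1
    sign = -1 if (swaps + squares) % 2 else 1
    return a ^ b, sign

def clifford_product(u, v):
    """Product of multivectors stored as {blade bitmask: coefficient}."""
    out = {}
    for a, ca in u.items():
        for b, cb in v.items():
            blade, sign = blade_product(a, b)
            out[blade] = out.get(blade, 0) + sign * ca * cb
    return out

def vector(x):
    """Embed a coordinate vector: component k sits on the blade e_{k+1}."""
    return {1 << k: c for k, c in enumerate(x)}

x, y = (1, 2, 3), (4, 5, 6)
xy = clifford_product(vector(x), vector(y))

dot = sum(a * b for a, b in zip(x, y))
cross = (x[1] * y[2] - x[2] * y[1], x[2] * y[0] - x[0] * y[2], x[0] * y[1] - x[1] * y[0])

assert xy[0b000] == -dot                                # grade 0: minus the dot product
assert (xy[0b110], -xy[0b101], xy[0b011]) == cross      # grade 2: the cross product
assert all(bin(blade).count("1") in (0, 2) for blade in xy)   # only grades 0 and 2
assert clifford_product(vector(x), vector(x))[0] == -sum(a * a for a in x)
```

The grade-2 coefficients sit on the blades e_2 e_3, e_1 e_3, and e_1 e_2, which is why the middle one is negated when read as a cross product.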

We can present one possible concrete construction of the Clifford algebra, 
based on ideas that we've already covered. One way to build Clif(X) is as 
T(X)/K, where K is the smallest two-sided ideal of the tensor algebra
T(X) that contains all elements of the form $(x \otimes x) + Q(x)$, for x in X. For
a more concrete way to construct the Clifford algebra and for much other 
wisdom, read Porteous [41]. 

By the way, the theory of Clifford algebras involves a contentious choice,
rather like our annoying factor of n!, except that it's an annoying factor of
$-1$. Some authors require $x^2 = Q(x)$, rather than $x^2 = -Q(x)$. But including
the minus sign ends up singling out the positive definite forms Q as being
of special interest, and that seems more appealing than the alternative of
singling out the negative definite ones.

Some Category Theory



Category theory provides a general framework for defining things by universal 
mapping conditions, as happens when we linearize an affine space or algebrize 
a linear space. In particular, linearization and algebrization turn out to be left 
adjoints of forgetful functors. We here review linearization and algebrization 
from the enlightening perspective of category theory. 

A.1 Fixing some type errors

Let's start by fixing up some type errors in our discussion of linearization. 
Let A be an affine space. We defined a linearization of A to be a linear space 
X and an affine map $i\colon A \to X$ that satisfied a certain universal mapping
condition, Condition 9.1-2. But it is a type error for the codomain of an
affine map to be a linear space. The expression $i\colon A \to X$ makes sense only
if we implicitly perform a type conversion on the codomain, replacing the
linear space X by that same set of points, but viewed as an affine space.
To perform that type conversion, we take the linear space X and we forget
about where the origin is; let's denote the resulting affine space as T(X),
on the grounds that forgetting the origin is like performing some arbitrary
translation. Affine combinations $t_1 x_1 + \cdots + t_m x_m$ with $t_1 + \cdots + t_m = 1$ still
make sense in the affine space T(X), and they have the same values that they
had in the linear space X. But linear combinations with $t_1 + \cdots + t_m \ne 1$ no
longer make sense in T(X).

If we insert the type-conversion operator T appropriately, we can define 
what it means to be a linearization in a completely type-correct manner. 

Condition A.1-1 For any affine space A, a linear space X and an affine map
$i\colon A \to T(X)$ are a linearization of A when, for every linear space Y and
every affine map $j\colon A \to T(Y)$, there exists a unique linear map $f\colon X \to Y$
with $T(f) \circ i = j$.

To make the final equation type-correct, we had to apply the type-conversion
operator T to the linear function $f\colon X \to Y$, thereby converting it into an
affine function $T(f)\colon T(X) \to T(Y)$. The two functions f and T(f) are
identical as set maps; that is, they take the same input points to the same
output points. The difference between them is purely type-theoretic: The
map f is a linear map between linear spaces, while T(f) is an affine map
between affine spaces.

Since T applies both to spaces and to maps, it isn't an operator; to say 
what it really is, we need some category theory. A category C is 

1. a collection of objects; 

2. for each pair of objects X and Y, a collection of arrows C(X,Y), 
referred to as arrows from X to Y; 

3. for each object X, a special arrow in C(X,X), called the identity on 
X; and 

4. for each triple of objects X, Y, and Z in C, a composition operation
$\circ\colon C(Y,Z) \times C(X,Y) \to C(X,Z)$, taking the arrows $f\colon X \to Y$ and
$g\colon Y \to Z$ to an arrow $g \circ f\colon X \to Z$. (Note the clumsiness that results
from composing from right-to-left, as we discuss in Section C.1; but
that clumsiness is standard in category theory.)

These structures must satisfy a few axioms: 

1. Identity arrows must behave as identities for the composition operator 
on arrows. 

2. The composition on arrows must be associative. 

In many important categories, the objects are spaces with some structure, 
while the arrows are maps between spaces that preserve this structure. For 
example, there is a category Aff, whose objects are affine spaces and whose
arrows are affine maps. Similarly, there is a category Lin of linear spaces 
and linear maps, as well as a category Set of sets and set maps. 

A functor is a structure-preserving map from one category to another. So
a functor $\mathcal{F}$ from C to D must map each object X of C to some object $\mathcal{F}(X)$
of D and must map each arrow $f\colon X \to Y$ of C to an arrow
$\mathcal{F}(f)\colon \mathcal{F}(X) \to \mathcal{F}(Y)$ of D. A functor is also required to map identity arrows to
identity arrows and to commute with composition, so that
$\mathcal{F}(g \circ f) = \mathcal{F}(g) \circ \mathcal{F}(f)$.

Using this language, we can recognize the operator T above as a functor 
from Lin to Aff of a particularly simple type: It converts linear spaces into
affine spaces by forgetting the origin, and it converts linear maps into affine 
maps by forgetting the origins in both the domain and codomain. Functors 
of this type are called forgetful functors. 
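
The two functor laws can be checked for the origin-forgetting functor T in coordinates. The sketch below makes assumptions of our own: linear maps are matrices (lists of rows), affine maps are pairs (M, b) standing for x -> M x + b, and `forget` is a hypothetical name for the action of T on arrows.

```python
def mat_mul(a, b):
    """Matrix product of a and b, both stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_vec(a, v):
    """Matrix a applied to the vector v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]

def compose_affine(g, f):
    """Compose affine maps stored as pairs (M, b): first f, then g."""
    (mg, bg), (mf, bf) = g, f
    return (mat_mul(mg, mf), [s + t for s, t in zip(mat_vec(mg, bf), bg)])

def forget(m):
    """T on arrows: view the linear map x -> M x as an affine map."""
    return (m, [0] * len(m))

f = [[1, 2], [0, 1], [3, -1]]      # a linear map from R^2 to R^3
g = [[2, 0, 1], [1, 1, 0]]         # a linear map from R^3 to R^2
identity = [[1, 0], [0, 1]]

# T commutes with composition and sends identities to identities.
assert forget(mat_mul(g, f)) == compose_affine(forget(g), forget(f))
assert compose_affine(forget(identity), forget(g)) == forget(g)
```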
A.2 Linearization as a functor

Introducing the forgetful functor T from Lin to Aff has allowed us to fix
the type errors in our definition of a linearization. The resulting concept is
perfectly respectable in category theory, where an arrow $i\colon A \to T(X)$ that
satisfies Condition A.1-1 would be referred to as universal from A to T, that
is, universal from the object A to the functor T.

But there is an important property of linearization that we have not yet
captured. So far, we have been treating the parameter A as a fixed affine
space. That suggests that our ability to linearize A might depend upon
some special property of A; perhaps some affine spaces can be linearized,
but others cannot. No. One of the good things about linearization is that
we can linearize any affine space. Even more, we can extend any affine map
$f\colon A \to B$ between affine spaces uniquely into a linear map $\hat{f}\colon \hat{A} \to \hat{B}$
between the corresponding linearizations. Thus, the process of linearization
is actually a functor in the other direction, from Aff back to Lin. Instead of
writing hat accents, let's refer to this new functor as $\mathcal{L}$. So, given any affine
space A, the functor $\mathcal{L}$ produces for us a linear space $\mathcal{L}(A)$; and, given any
affine map $f\colon A \to B$, we get a linear map $\mathcal{L}(f)\colon \mathcal{L}(A) \to \mathcal{L}(B)$.

The universal mapping condition is now revealed to be some flavor of
pseudo-inverse relationship between the two functors $\mathcal{L}$ and T. But note
that $\mathcal{L}$ is not even a one-sided inverse of T, on either side. We can't have
$\mathcal{L}(T(X)) = X$ for any linear space X, nor can we have $T(\mathcal{L}(A)) = A$ for any
affine space A, because $\mathcal{L}$ increases the dimension by one, while T preserves
the dimension. We also can't have $\mathcal{L}(T(\mathcal{L}(A))) = \mathcal{L}(A)$ or any other such
identity in which one pair of functor applications drops out. Instead, the
pseudo-inverse relationship between $\mathcal{L}$ and T is of a different nature.
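
One concrete model of the linearizing functor is the familiar homogeneous-coordinate trick. The sketch below assumes that model, which is our choice for illustration: a point x of a Cartesian affine space is embedded as (x, 1), and an affine map x -> M x + b becomes the block matrix [[M, b], [0, 1]]. The function `linearize` is a hypothetical name for the functor's action on arrows.

```python
def mat_mul(a, b):
    """Matrix product of a and b, both stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_vec(a, v):
    """Matrix a applied to the vector v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in a]

def compose_affine(g, f):
    """Compose affine maps stored as pairs (M, b): first f, then g."""
    (mg, bg), (mf, bf) = g, f
    return (mat_mul(mg, mf), [s + t for s, t in zip(mat_vec(mg, bf), bg)])

def linearize(affine_map):
    """The affine map x -> M x + b becomes the linear map [[M, b], [0, 1]]."""
    m, b = affine_map
    rows = [list(row) + [off] for row, off in zip(m, b)]
    rows.append([0] * len(m[0]) + [1])
    return rows

f = ([[1, 2]], [3])                # affine map from a plane to a line
g = ([[2], [-1]], [0, 5])          # affine map from a line to a plane

# Functoriality: linearizing a composite gives the product of linearizations.
assert linearize(compose_affine(g, f)) == mat_mul(linearize(g), linearize(f))

# Each linearized space has one more dimension than the affine space.
assert len(linearize(f)[0]) == 2 + 1 and len(linearize(f)) == 1 + 1
```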

A.3 Left adjoints

Let's restate the universal mapping condition, using both the forgetful functor
T and the linearizing functor $\mathcal{L}$, in particular replacing X by $\mathcal{L}(A)$:

Condition A.3-1 For every affine space A, the linear space $\mathcal{L}(A)$ and the
affine map $i_A\colon A \to T(\mathcal{L}(A))$ have the property that, for every linear space
Y and every affine map $j\colon A \to T(Y)$, there exists a unique linear map
$f\colon \mathcal{L}(A) \to Y$ with $T(f) \circ i_A = j$.

Rather than defining what it means to be a linearization of a particular affine
space A, this condition expresses the pseudo-inverse relationship that holds
between the functors T and $\mathcal{L}$. The key to that pseudo-inverse relationship
is a certain system of one-to-one correspondences between sets of arrows.

Treating the affine space A and the linear space Y as free parameters,
consider the set Aff(A, T(Y)) of all affine maps from A to T(Y) and the set
$\mathrm{Lin}(\mathcal{L}(A), Y)$ of all linear maps from $\mathcal{L}(A)$ to Y. Condition A.3-1 gives us a
map from the former to the latter; for every j in Aff(A, T(Y)), there exists
a unique f in $\mathrm{Lin}(\mathcal{L}(A), Y)$. We also have a map the other way; given any
f in $\mathrm{Lin}(\mathcal{L}(A), Y)$, applying the functor T to f and then composing with $i_A$
on the right gives us a map $T(f) \circ i_A$, which lies in Aff(A, T(Y)). The final
equality in the universal condition tells us that those forward and backward
maps are inverses. Thus, for every affine space A and every linear space Y,
we have a one-to-one correspondence

(A.3-2) $\mathrm{Aff}(A, T(Y)) \longleftrightarrow_{A,Y} \mathrm{Lin}(\mathcal{L}(A), Y)$.

It is those correspondences that lie at the heart of the relevant pseudo-inverse
relationship. The functor $\mathcal{L}$ is said to be left adjoint to T (or, equivalently,
the functor T is right adjoint to $\mathcal{L}$) when these correspondences exist,
for all A and Y, and when they behave naturally with respect to affine
and linear maps, as we'll discuss below. The resulting relationship is called
an adjunction. Note that $\mathcal{L}$ is a left adjoint because it appears in a left-hand
argument in Correspondence A.3-2, while T is a right adjoint because
it appears in a right-hand argument. (The meanings of "left adjoint" and
"right adjoint" thus depend, unfortunately, on our convention that functions
compose from right to left. Perhaps more semantic terms, such as source
adjoint and target adjoint, would be a better idea?)

What happens if we set the linear space Y, in Correspondence A.3-2,
to be $\mathcal{L}(A)$? We then get the set
$\mathrm{Lin}(\mathcal{L}(A), \mathcal{L}(A))$ on the right, which is
interesting because that set has a distinguished element: the identity on
$\mathcal{L}(A)$. The arrow on the left that corresponds to that identity on
the right is a special affine map from A to $\mathcal{F}(\mathcal{L}(A))$. In
fact, it is the map that we denoted $i_A$ in Condition A.3-1; it shows how
the affine space A sits, as an affine hyperplane, inside its linearization
$\mathcal{L}(A)$. The maps $i_A$ are called the units of the adjunction.
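
To make Correspondence A.3-2 and its units concrete, here is a minimal Python sketch. The coordinates are our own assumption, not part of the abstract construction: we take A to be the affine plane R^2, model its linearization as R^3, and let the unit place A on the hyperplane where the last coordinate is 1. The helper names (unit, affine_map, to_linear, to_affine) are hypothetical.

```python
# Assumed model: A = R^2 as an affine space, L(A) = R^3, and the unit i_A
# sends the point (x, y) to (x, y, 1). Y = R^2 throughout.

def unit(p):
    """The unit i_A: embed a point of A as a vector of L(A) at height 1."""
    return (*p, 1)

def affine_map(M, b):
    """The affine map j(p) = M p + b from A to F(Y)."""
    def j(p):
        return tuple(sum(m * x for m, x in zip(row, p)) + c
                     for row, c in zip(M, b))
    return j

def to_linear(M, b):
    """Forward direction: the linear map on L(A) whose matrix is M with b
    appended as a last column; it restricts to j on the hyperplane."""
    rows = [tuple(row) + (c,) for row, c in zip(M, b)]
    def f(v):
        return tuple(sum(m * x for m, x in zip(row, v)) for row in rows)
    return f

def to_affine(f):
    """Backward direction: F(f) o i_A, the restriction of f to i_A(A)."""
    return lambda p: f(unit(p))

M = [(2, 0), (1, 3)]
b = (5, -1)
j = affine_map(M, b)
f = to_linear(M, b)
points = [(0, 0), (1, 0), (0, 1), (2, -3)]

# Going forward and then backward recovers the affine map we started with.
round_trip_ok = all(to_affine(f)(p) == j(p) for p in points)

# Setting Y = L(A) and taking f to be the identity recovers the unit itself.
identity = lambda v: tuple(v)
unit_ok = all(to_affine(identity)(p) == unit(p) for p in points)
```

In this model, the correspondence amounts to nothing more than appending the translation part b to the matrix M as an extra column, which is why the unit can be read off by transposing the identity.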

For completeness, we should mention that an adjunction has counits as
well, which are linear maps in our example. To find the counits, we set A to
$\mathcal{F}(Y)$ in Correspondence A.3-2 and then map the identity on the
left over to the right. The result is a special linear map
$c_Y : \mathcal{L}(\mathcal{F}(Y)) \to Y$. If we convert a linear space Y into
an affine space by forgetting about its origin and then re-linearize it by
adding a new, external origin, the counit $c_Y$ projects out the newly added
dimension, projecting along lines that are parallel to the line joining the
new origin to the old. Note that the units in our example go from simple to
complicated, while the counits go from complicated to simple. As for
figuring out whether $\mathcal{L}$ or $\mathcal{F}$ gets applied first, that
depends upon which is the left adjoint and which is the right adjoint.
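
The geometric description of the counit can be checked in coordinates. The following sketch assumes Y = R^2, models the re-linearized space as R^3 with the points of the affine copy of Y at height 1, and uses hypothetical helper names; it is an illustration under those assumptions, not a general construction.

```python
# Assumed model: Y = R^2; F(Y) is the same plane with its origin forgotten;
# L(F(Y)) = R^3, with F(Y) embedded at height 1 by the unit.

def unit(y):
    return (*y, 1)

def counit(v):
    """c_Y: project out the newly added coordinate, (y1, y2, w) -> (y1, y2)."""
    return v[:-1]

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(t, v):
    return tuple(t * a for a in v)

new_origin = (0, 0, 0)          # the external origin added by linearizing
old_origin = unit((0, 0))       # the forgotten origin of Y, now at height 1

samples = [(3, 4), (-1, 2), (0, 0)]

# The counit corresponds to the identity on F(Y): it undoes the unit.
restricts_to_identity = all(counit(unit(y)) == y for y in samples)

# It projects along the direction joining the new origin to the old one.
direction = add(old_origin, scale(-1, new_origin))
kills_direction = counit(direction) == (0, 0)
v = (3, 4, 7)
same_image = counit(add(v, scale(5, direction))) == counit(v)
```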

A.4 Behaving naturally

Given two categories and two functors that go back and forth between them, 
an adjunction exists between the two functors when there is a system of
one-to-one correspondences as in Correspondence A.3-2 that behaves naturally
with respect to arrows in the two categories. For completeness, we sketch
briefly in this section what it means to "behave naturally". For more details,
see Mac Lane's fine text [40]. 

Let $g : Y \to Z$ be some arrow in the category Lin. The adjunction gives
us the horizontal one-to-one correspondences in the following diagram:

\[
\begin{array}{ccc}
\mathrm{Aff}(A, \mathcal{F}(Y)) & \longleftrightarrow_{A,Y} & \mathrm{Lin}(\mathcal{L}(A), Y) \\
\Big\downarrow{\scriptstyle \mathcal{F}(g)_*} & & \Big\downarrow{\scriptstyle g_*} \\
\mathrm{Aff}(A, \mathcal{F}(Z)) & \longleftrightarrow_{A,Z} & \mathrm{Lin}(\mathcal{L}(A), Z)
\end{array}
\]

The vertical maps come from g. Given any linear map from $\mathcal{L}(A)$ to Y,
as on the top right, we can apply first that map and then g to get a linear
map from $\mathcal{L}(A)$ to Z, as on the bottom right; the vertical arrow
labeled $g_*$ represents that "follow up with g" process. The arrow on the
left labeled $\mathcal{F}(g)_*$ represents the analogous "follow up with
$\mathcal{F}(g)$" process. In order to behave properly with respect to linear
maps, this diagram must commute.

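Behaving properly with respect to affine maps is similar. Let $h : A \to B$
be some arrow in the category Aff. The adjunction gives us the horizontal
correspondences in this diagram:

\[
\begin{array}{ccc}
\mathrm{Aff}(B, \mathcal{F}(Y)) & \longleftrightarrow_{B,Y} & \mathrm{Lin}(\mathcal{L}(B), Y) \\
\Big\downarrow{\scriptstyle h^*} & & \Big\downarrow{\scriptstyle \mathcal{L}(h)^*} \\
\mathrm{Aff}(A, \mathcal{F}(Y)) & \longleftrightarrow_{A,Y} & \mathrm{Lin}(\mathcal{L}(A), Y)
\end{array}
\]

Given any affine map from B to $\mathcal{F}(Y)$, as on the top left, we can
apply first h and then that map to get a map from A to $\mathcal{F}(Y)$, as on
the bottom left; the vertical arrow labeled $h^*$ represents that "prepare
with h" process. The vertical arrow labeled $\mathcal{L}(h)^*$ represents the
analogous "prepare with $\mathcal{L}(h)$" process. That diagram must also
commute.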
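
Both naturality squares can be spot-checked numerically. The sketch below reuses the coordinate model assumed earlier (an n-dimensional affine space is R^n, its linearization is R^(n+1), and the unit appends a final coordinate 1); the particular maps h, f, and g and all helper names are hypothetical choices made only for illustration.

```python
# Assumed model: affine spaces are R^n, linearizations are R^(n+1), and the
# unit appends a final coordinate equal to 1.

def unit(p):
    return (*p, 1)

def apply_matrix(rows, v):
    return tuple(sum(m * x for m, x in zip(row, v)) for row in rows)

def compose(outer, inner):
    return lambda v: outer(inner(v))

def affine(M, b):
    """The affine map p -> M p + b."""
    return lambda p: tuple(y + c for y, c in zip(apply_matrix(M, p), b))

def linearize(M, b):
    """L(h) for h(p) = M p + b, as the block matrix [[M, b], [0, 1]]."""
    rows = [tuple(r) + (c,) for r, c in zip(M, b)]
    rows.append((0,) * len(M[0]) + (1,))
    return lambda v: apply_matrix(rows, v)

def to_affine(f):
    """Right-to-left direction of the correspondence: f -> F(f) o unit."""
    return compose(f, unit)

# h: A -> B with A = R^1 and B = R^2; f: L(B) -> Y = R^2; g: Y -> Z = R^1
hM, hb = [(2,), (-1,)], (1, 4)
h, Lh = affine(hM, hb), linearize(hM, hb)
f = lambda v: apply_matrix([(1, 0, 2), (0, 3, -1)], v)
g = lambda y: apply_matrix([(1, 1)], y)

points_A = [(0,), (1,), (-2,), (5,)]
points_B = [(0, 0), (1, 2), (-3, 1)]

# Second square: "prepare with h" after crossing over agrees with
# crossing over after "prepare with L(h)".
square_h = all(compose(to_affine(f), h)(p) == to_affine(compose(f, Lh))(p)
               for p in points_A)

# First square: "follow up with F(g)" after crossing over agrees with
# crossing over after "follow up with g".
square_g = all(compose(g, to_affine(f))(q) == to_affine(compose(g, f))(q)
               for q in points_B)
```

In this model the first square commutes simply because composition is associative; the second square carries the real content, namely that L(h) composed with the unit of A equals the unit of B composed with h.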



A.5 A ladder of adjunctions

Enough, already, of category theory. What does category theory tell us about 
the process of algebrization that underlies the paired-algebras framework? 

Let CAlg denote the category whose objects are commutative algebras
and whose arrows are algebra homomorphisms. Given any commutative
algebra, if we forget how to multiply, we are left with a linear space; thus,
there is an obvious forgetful functor from CAlg to Lin. Let's denote that
functor $\mathcal{M}$, on the grounds that it forgets how to multiply. The
process of algebrization is precisely a functor $\mathcal{A}$ that is left
adjoint to the forgetful functor $\mathcal{M}$. That is, writing
$\mathcal{A}(X)$ for the symmetric algebra Sym(X), we have one-to-one
correspondences

\[ \mathrm{Lin}(X, \mathcal{M}(G)) \longleftrightarrow_{X,G} \mathrm{CAlg}(\mathcal{A}(X), G). \]

[Figure A.1: A ladder of adjunctions]

Figure A.1 shows a ladder of adjunctions of which we have explained
the top two steps. On the right, we have forgetful functors going down,
$\mathcal{M}: \mathrm{CAlg} \to \mathrm{Lin}$ and
$\mathcal{F}: \mathrm{Lin} \to \mathrm{Aff}$. On the left, their left adjoints
go back up, linearization $\mathcal{L}: \mathrm{Aff} \to \mathrm{Lin}$ and
algebrization $\mathcal{A}: \mathrm{Lin} \to \mathrm{CAlg}$.

It is helpful to add on one more step at the bottom of the ladder. Let
$\mathcal{U}$ denote the forgetful functor from Aff to Set, the functor that
takes an affine space and forgets everything about it except for the
underlying set of points ($\mathcal{U}$ for underlying). The left adjoint of
$\mathcal{U}$ is the functor $\mathcal{G}$ that, given any set S, produces the
"affinization" or "geometrization" of S, an affine space for which the
points in S form a barycentric frame. Note that we again have one-to-one
correspondences

\[ \mathrm{Set}(S, \mathcal{U}(A)) \longleftrightarrow_{S,A} \mathrm{Aff}(\mathcal{G}(S), A). \]

So we have a ladder with four rungs, each adjacent pair of rungs connected
by a forgetful functor, going down, and its left adjoint, going back up. Going
down is always easy. Stepping up from Set to Aff is also conceptually
easy, although the resulting affine spaces can be quite large. The other two
upward steps, linearization and algebrization, are more subtle. But note that
jumping up from Set to Lin, taking two steps at once, is easy; given a set S,
the linear space $\mathcal{L}(\mathcal{G}(S))$ is simply a linear space that has
S as a basis. And jumping all the way up the ladder from Set to CAlg is easy
as well; given a set S, the commutative algebra
$\mathcal{A}(\mathcal{L}(\mathcal{G}(S)))$ is simply the algebra $\mathbb{R}[S]$
of all polynomials whose variables lie in S. If we jump up from a set S to
any rung, we simply get the free thing of that type that is generated by S:
the free affine space with S as a barycentric reference frame, the free
linear space with S as a basis, or the free commutative algebra with S as its
generating variables. The subtlety comes only when we have already stepped
up and we want to step farther up.

Indeed, recall that the easiest concrete construction that we found for
linearizing an affine space A involved choosing a barycentric reference frame
for A and then forming $\mathcal{L}(A)$ as the linear space with those frame
points as a basis. In terms of the ladder in Figure A.1, choosing a
barycentric frame for A means finding some set S with $\mathcal{G}(S) = A$.
Note that this is not at all the same as computing $\mathcal{U}(A)$, since the
functors $\mathcal{G}$ and $\mathcal{U}$ are adjoints, not inverses. Having
found some S with $\mathcal{G}(S) = A$, we can then jump up from Set to Lin in
one easy step, by forming the linear space that has S as a basis. Thus,
jumping up from the ground is so easy that we use it as a subroutine when
stepping up one rung.

Given a linear space X, jumping up from Set is also the easiest way to
construct the symmetric algebra $\mathcal{A}(X) = \mathrm{Sym}(X)$. We choose
some basis for X; that is, we find some set S with
$X = \mathcal{L}(\mathcal{G}(S))$. We then construct $\mathcal{A}(X)$, in a big
jump back up, as the polynomial algebra $\mathbb{R}[S]$.
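
The sense in which the polynomial algebra on S is "free" can be illustrated in code. In the sketch below, a polynomial is a dictionary from monomials (sorted tuples of variable names) to coefficients; that representation, the helper names, and the choice of the real numbers themselves as the target algebra are all assumptions made for illustration. The point is that a mere set map from S into an algebra extends to an algebra homomorphism.

```python
from math import prod

def var(s):
    """The generator of the polynomial algebra named by the element s of S."""
    return {(s,): 1}

def const(c):
    return {(): c} if c != 0 else {}

def add(p, q):
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = tuple(sorted(m1 + m2))       # commutativity: sort the factors
            out[m] = out.get(m, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def extend(assignment):
    """Extend a set map S -> G to an algebra homomorphism from the
    polynomial algebra to G, where G is assumed to be the reals."""
    def hom(p):
        return sum(c * prod(assignment[s] for s in mono)
                   for mono, c in p.items())
    return hom

u, v = var("u"), var("v")
p = add(u, v)                      # u + v
q = add(mul(u, v), const(3))       # uv + 3
hom = extend({"u": 2, "v": 5})

respects_sums = hom(add(p, q)) == hom(p) + hom(q)
respects_products = hom(mul(p, q)) == hom(p) * hom(q)
is_commutative = mul(u, v) == mul(v, u)
```

The homomorphism is forced by its values on the generators, which is the uniqueness half of the universal property; nothing comparable is available for an algebra that is not generated freely by some set of variables.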

This ability to step up by backing down to ground level and then jumping 
one higher is a special property of the particular ladder of adjunctions in 
Figure A.1. For example, we couldn't employ the same strategy to step up
above CAlg, if we added a new rung at the top of the ladder, since not 
every commutative algebra is the polynomial algebra generated by some set 
of variables. One of the things that make Aff and Lin simple is that every 
object has a frame or basis; but that doesn't hold for CAlg. 

Exercise A.5-1 In preparation for the day when locations over A might
join sites over A as basic objects in CAGD, so that we can divide by points
as well as multiply by them, extend the adjunction ladder of Figure A.1 with
one more rung at the top.

Hint: First replace the former top rung CAlg by the smaller category
EAlg of entire algebras, that is, commutative algebras that are free of zero
divisors and that hence, when viewed as rings, are integral domains (a.k.a.
entire rings). Then verify that the functors $\mathcal{M}$ and $\mathcal{A}$
also form an adjunction between Lin and EAlg. We can then add, as a new top
rung, the category RFld consisting of those fields that include the real
numbers as a subfield. The forgetful functor
$\mathcal{D}: \mathrm{RFld} \to \mathrm{EAlg}$ forgets how to divide, while its
left adjoint $\mathcal{Q}: \mathrm{EAlg} \to \mathrm{RFld}$ forms quotients.






A.6 Injective units

All of the left adjoints that we have been studying (including the functor
$\mathcal{Q}$ in Exercise A.5-1) have the special property that they only add in new, good
stuff; they don't squash out any old, bad stuff. In general, left adjoints of 
forgetful functors may have to do some of both. 

For an example of squashing out old, bad stuff, consider the category
Grp, whose objects are groups and whose arrows are group homomorphisms.
And consider the subcategory Ab, with only the abelian groups. There is
an obvious forgetful functor $\mathcal{F}: \mathrm{Ab} \to \mathrm{Grp}$; it
takes an abelian group H to that same group $\mathcal{F}(H) := H$, but now
viewed as a general group; that is, the functor $\mathcal{F}$ forgets the
axiom for commutativity.

The forgetful functor $\mathcal{F}$ has a left adjoint $\mathcal{C}$. How does
$\mathcal{C}$ work? Given an arbitrary group G, we must produce, in some
sense, the free abelian group generated by G. Let [G, G] denote the
commutator subgroup of G, the subgroup generated by all elements of the form
$xyx^{-1}y^{-1}$, for x and y in G. The commutator subgroup [G, G] is normal,
so we can form the quotient group G/[G, G], which will be abelian. That
quotient is $\mathcal{C}(G)$. In particular, we have the required one-to-one
correspondences:

\[ \mathrm{Grp}(G, \mathcal{F}(H)) \longleftrightarrow_{G,H} \mathrm{Ab}(\mathcal{C}(G), H). \]

Thus, the left adjoint $\mathcal{C}$ produces an abelian group $\mathcal{C}(G)$
from an arbitrary group G by squashing out any bad, nonabelian stuff in G.

To determine whether a particular left adjoint adds in new, good stuff,
squashes out old, bad stuff, or does some of both, we examine the units of
the adjunction. An injective unit means that there was no need to squash out
any old, bad stuff, while a surjective unit means that there was no need to
add in any new, good stuff. All of the adjunctions in the ladder of Figure A.1
(even when extended as in Exercise A.5-1) have injective units, while the
unit that maps an arbitrary group G to the "abelianized" group
$\mathcal{F}(\mathcal{C}(G)) = G/[G,G]$ is surjective.
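
As a small worked instance of abelianization, the following sketch computes the commutator subgroup and the quotient for the symmetric group on three letters. The choice of that group and the representation of permutations as tuples of images are assumptions made for illustration; the helper names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """(p o q)(i) = p[q[i]]; a permutation is the tuple of its images."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(gens, identity):
    """Close the generators under multiplication (enough in a finite group)."""
    group, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in group:
                group.add(y)
                frontier.append(y)
    return group

G = list(permutations(range(3)))      # the symmetric group on three letters
e = tuple(range(3))

# All elements x y x^-1 y^-1, and the subgroup [G, G] that they generate.
commutators = {compose(compose(x, y), compose(inverse(x), inverse(y)))
               for x in G for y in G}
derived = generated_subgroup(commutators, e)

def unit(x):
    """The unit of the adjunction: send x to its coset x[G, G]."""
    return frozenset(compose(x, d) for d in derived)

cosets = {unit(x) for x in G}
quotient_is_abelian = all(unit(compose(x, y)) == unit(compose(y, x))
                          for x in G for y in G)
unit_is_injective = len(cosets) == len(G)
```

Here the commutator subgroup has three elements and the quotient has two, so the unit is surjective by construction but collapses six elements onto two cosets: it squashes out the nonabelian part and adds nothing new.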

A.7 Right adjoints

Linearization and algebrization can be viewed as left adjoints of forgetful 
functors; but what about right adjoints of forgetful functors? Do they ever 
arise? I don't know of any examples in CAGD; but, for completeness, here 
is a natural forgetful functor in topology whose left adjoint and right adjoint 
are both useful. 

Let Top be the category whose objects are topological spaces and whose
arrows are continuous maps. There is an obvious forgetful functor
$\mathcal{F}$ from Top to Set, the functor that forgets the topology. An
adjoint of that forgetful functor must take an arbitrary set W and invent
some topology for it.

If $\mathcal{S}$ is to be the left adjoint of $\mathcal{F}$, we must have
one-to-one correspondences

\[ \mathrm{Set}(W, \mathcal{F}(U)) \longleftrightarrow_{W,U} \mathrm{Top}(\mathcal{S}(W), U). \]

That is, any set map from W to any topological space U must, when we
equip W with the topology that $\mathcal{S}$ chooses for it, turn out to be
continuous. So $\mathcal{S}$ must equip W with the discrete topology, the
topology in which every subset of W is open.

On the other hand, if $\mathcal{T}$ is to be the right adjoint of
$\mathcal{F}$, we must have one-to-one correspondences

\[ \mathrm{Set}(\mathcal{F}(U), W) \longleftrightarrow_{U,W} \mathrm{Top}(U, \mathcal{T}(W)). \]

That is, any set map from any topological space U to W must, when we
equip W with the topology that $\mathcal{T}$ chooses for it, turn out to be
continuous. So $\mathcal{T}$ must equip W with the trivial topology, the
topology in which only W and the empty set are open.
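
For finite sets, both claims can be verified by brute force. The sketch below assumes a three-point set W and takes U to be a two-point space with one nontrivial open set (the Sierpinski space); these choices and the helper names are ours, made only for illustration.

```python
from itertools import chain, combinations, product

def powerset(points):
    pts = list(points)
    subsets = chain.from_iterable(combinations(pts, r)
                                  for r in range(len(pts) + 1))
    return {frozenset(s) for s in subsets}

def is_continuous(f, open_src, open_dst):
    """f is a dict; it is continuous when every open set has open preimage."""
    return all(frozenset(x for x in f if f[x] in V) in open_src
               for V in open_dst)

def all_maps(src, dst):
    src, dst = list(src), list(dst)
    return [dict(zip(src, images))
            for images in product(dst, repeat=len(src))]

W = {"a", "b", "c"}
U = {0, 1}
open_U = {frozenset(), frozenset({0}), frozenset({0, 1})}
discrete_W = powerset(W)
trivial_W = {frozenset(), frozenset(W)}

# Left adjoint: every set map out of W is continuous for the discrete topology.
left_ok = all(is_continuous(f, discrete_W, open_U) for f in all_maps(W, U))

# Right adjoint: every set map into W is continuous for the trivial topology.
right_ok = all(is_continuous(f, open_U, trivial_W) for f in all_maps(U, W))

# Swapping the two topologies breaks both correspondences.
left_fails_if_trivial = not all(is_continuous(f, trivial_W, open_U)
                                for f in all_maps(W, U))
right_fails_if_discrete = not all(is_continuous(f, open_U, discrete_W)
                                  for f in all_maps(U, W))
```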

A.8 Tensor-product surfaces revisited

Gentle reader, you will have to decide for yourself whether the unification, 
systematization, and type correctness provided by category theory are worth 
its conceptual overhead. But we can at least mention one instance where 
category theory would have assisted us in this monograph. That instance 
involves the tensor-product surfaces in Section 6.8, which turn out to be 
related to the notion of coproducts in category theory. 

Products and coproducts are category-theoretic notions that may or may 
not exist in a given category. They are abstractly characterized by universal 
mapping conditions; indeed, the product, when it exists, is the right adjoint 
of a certain diagonal functor, while the coproduct, when it exists, is the left 
adjoint of that same functor. For the details, see Mac Lane [40]. 

Products and coproducts exist in all four of the categories that are the
rungs on the adjunction ladder in Figure A.1. In the category Set, products
are Cartesian products and coproducts are disjoint unions. In Aff, products
are again Cartesian products, while coproducts are affine hulls; that is, to
form the coproduct $A \amalg B$ of two affine spaces A and B, we position them
in some larger affine space so as to be skew and so that no line in A is
parallel to any line in B and we then take the affine hull of their union.
We thus have $\dim(A \amalg B) = \dim(A) + \dim(B) + 1$. In Lin, products are
Cartesian products, while coproducts are direct sums. Note that Lin is one
of the unusual categories where (finite) products and (finite) coproducts are
isomorphic; we have $X \times Y = X \oplus Y$, for linear spaces X and Y.
Finally, in CAlg, products are Cartesian products while coproducts are
tensor products.

On our adjunction ladder, coproducts are more interesting than products,
since they correspond to different notions in each of the four categories.
Those notions are tied together by the general theorem that any functor
that is a left adjoint preserves coproducts. Note that the geometrization,
linearization, and algebrization functors $\mathcal{G}$, $\mathcal{L}$, and
$\mathcal{A}$ are all left adjoints of forgetful functors. So we get three
theorems for free:

1. If S and T are sets, the affine space $\mathcal{G}(S \amalg T)$ that has
   the disjoint union $S \amalg T$ as a barycentric frame is the affine hull
   $\mathcal{G}(S) \amalg \mathcal{G}(T)$ of the affine spaces that have S and
   T as frames.

2. If A and B are affine spaces, the linearization
   $\mathcal{L}(A \amalg B)$ of their affine hull is the direct sum
   $\mathcal{L}(A) \oplus \mathcal{L}(B)$ of their linearizations.

3. If X and Y are linear spaces, the algebrization
   $\mathcal{A}(X \oplus Y)$ of their direct sum is the tensor product
   $\mathcal{A}(X) \otimes \mathcal{A}(Y)$ of their algebrizations.

The third of those theorems helps to explain why we build tensor-product
surfaces as we do. Starting with the direct product $L_1 \times L_2$ of two
affine parameter lines $L_1$ and $L_2$, the obvious thing to do would be to
linearize that entire plane at once. But that result
$\mathcal{L}(L_1 \times L_2)$ wouldn't be decomposable into parts associated
with $L_1$ and $L_2$, since linearization doesn't preserve products; indeed,
it would be 3-dimensional, while the product of the two linearized lines is
4-dimensional. So we instead linearize each affine parameter line
separately, getting the linear space
$\mathcal{L}(L_1) \times \mathcal{L}(L_2) = \hat{L}_1 \times \hat{L}_2$. At this
point, we take advantage of the unusual property of Lin to convert the
product into a coproduct, rewriting $\hat{L}_1 \times \hat{L}_2$ as the direct
sum $\hat{L}_1 \oplus \hat{L}_2$. The third theorem from the list above then
does the algebrizing for us: We have
$\mathrm{Sym}(\hat{L}_1 \oplus \hat{L}_2) = \mathrm{Sym}(\hat{L}_1) \otimes \mathrm{Sym}(\hat{L}_2)$,
a tensor-product algebra of sites. Tensor-product surfaces get their name
from the dual algebra of forms, which is the tensor product
$\mathrm{Sym}(\hat{L}_1^*) \otimes \mathrm{Sym}(\hat{L}_2^*)$ in a similar way.
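
The three theorems, and the contrast between linearizing a plane at once and linearizing two lines separately, can at least be sanity-checked by counting dimensions. The sketch below does only that bookkeeping; it assumes finite-dimensional spaces, uses hypothetical helper names, and verifies a necessary condition rather than proving any isomorphism.

```python
from math import comb

def dim_frame_space(size):
    """Dimension of the affine space with a frame of `size` points (size >= 1)."""
    return size - 1

def dim_coproduct_aff(a, b):
    """Dimension of the affine hull of two skew spaces of dimensions a and b."""
    return a + b + 1

def dim_linearization(a):
    """Linearizing adds one dimension."""
    return a + 1

def dim_sym(n, d):
    """Dimension of the degree-n part of Sym(X) when dim(X) = d >= 1."""
    return comb(n + d - 1, n)

# Theorem 1: a frame of s + t points versus the affine hull of two frames.
theorem_1 = all(dim_frame_space(s + t)
                == dim_coproduct_aff(dim_frame_space(s), dim_frame_space(t))
                for s in range(1, 6) for t in range(1, 6))

# Theorem 2: linearizing an affine hull versus summing the linearizations.
theorem_2 = all(dim_linearization(dim_coproduct_aff(a, b))
                == dim_linearization(a) + dim_linearization(b)
                for a in range(5) for b in range(5))

# Theorem 3, degree by degree: the degree-n part of a tensor product of
# graded algebras is the sum over k of (degree k) tensor (degree n - k).
theorem_3 = all(dim_sym(n, p + q)
                == sum(dim_sym(k, p) * dim_sym(n - k, q) for k in range(n + 1))
                for p in range(1, 4) for q in range(1, 4) for n in range(6))

# Two affine parameter lines, each of dimension 1.
plane_at_once = dim_linearization(1 + 1)
lines_separately = dim_linearization(1) + dim_linearization(1)
```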

Exercise A.8-1 Exercise A.5-1 extends the ladder of adjunctions, replacing
CAlg with EAlg and then adding RFld as a new top rung. Do the categories
EAlg of entire algebras and RFld of fields that include the real numbers as
a subfield have products? Have coproducts?

Hint: The tensor product is a coproduct in the category EAlg, just as in
CAlg. The category RFld also has coproducts, which can be constructed by
first forming the tensor product and then forming quotients. But the direct
product of two algebras always has zero divisors, since we have
$(1, 0) \cdot (0, 1) = (0, 0) = 0$; so the categories EAlg and RFld do not
have products.

(Often the coefficient 1/n! is put in front of the formula for $\iota$; this
makes no essential difference, but leads to awkward formulas for
contractions.) - William Fulton and Joe Harris [22]

In this appendix, we analyze which pairing leads to a superior theory, the
summed pairing or the averaged pairing. Fulton and Harris, the authors of
the quote above, are pure mathematicians who adopt the summed pairing.
When they promptly stumble across an annoying factor of n!, they analyze
the averaged pairing as an alternative in the single pithy sentence quoted
above. We here consider the issues at far greater length, on our way to the
same conclusion.



B.1 Searching for pretty formulas

One way to judge which pairing is better is to see which leads to the simpler
formulas. Table B.1 gives a good sample of formulas, both in their summed
and their averaged versions.

The third line in Table B.1 points out a minor problem with averaging. In
the averaging column, we find that
$D_\pi f(P) = \langle f, n\pi P^{n-1} \rangle$. If I were teaching this theory
to students, I would worry that my students would rewrite this formula as
$D_\pi f(P) = \langle f, \pi\, nP^{n-1} \rangle$ and would then justify the
annoying factor of n to themselves by recalling that

\[ \frac{d}{dP}\bigl(P^n\bigr) = nP^{n-1}. \tag{B.1-1} \]

That justification is a delusion based on a coincidence. We are differentiating
the n-form f in this formula, not the n-site $P^n$. If you are still in any
doubt, compare with the blossomed formula in the line below, where there is
no $P^{n-1}$ to remind us of the valid but irrelevant Equation B.1-1.

Table B.1

1. evaluate an n-form f at a point P
   summing:    $f(P) = \langle f, P^n/n! \rangle$
   averaging:  $f(P) = \langle f, P^n \rangle$

2. evaluate the blossom $\tilde{f}$ of an n-form f at the points
   $(P_1, \ldots, P_n)$
   summing:    $\tilde{f}(P_1, \ldots, P_n) = \langle f, P_1 \cdots P_n/n! \rangle$
   averaging:  $\tilde{f}(P_1, \ldots, P_n) = \langle f, P_1 \cdots P_n \rangle$

3. differentiate an n-form f in the direction of the vector $\pi$ and
   evaluate the resulting (n-1)-form at P
   summing:    $D_\pi f(P) = \langle f, \pi P^{n-1}/(n-1)! \rangle$
   averaging:  $D_\pi f(P) = \langle f, n\pi P^{n-1} \rangle$

4. differentiate an n-form f in the direction of the vector $\pi$ and
   evaluate the blossom of the resulting (n-1)-form at the points
   $(P_1, \ldots, P_{n-1})$
   summing:    $(D_\pi f)^{\sim}(P_1, \ldots, P_{n-1}) = \langle f, \pi P_1 \cdots P_{n-1}/(n-1)! \rangle$
   averaging:  $(D_\pi f)^{\sim}(P_1, \ldots, P_{n-1}) = \langle f, n\pi P_1 \cdots P_{n-1} \rangle$

5. differentiate an n-form f in the direction of the vector $\pi$
   summing:    $D_\pi f = f \llcorner \pi$
   averaging:  $D_\pi f = f \llcorner n\pi$

6. differentiate an n-form k times, in the directions of the vectors
   $\pi_1$ through $\pi_k$, and then evaluate the (n-k)-form that results at P
   summing:    $D_{\pi_1} \cdots D_{\pi_k} f(P) = \langle f, \pi_1 \cdots \pi_k P^{n-k}/(n-k)! \rangle$
   averaging:  $D_{\pi_1} \cdots D_{\pi_k} f(P) = \langle f, (n!/(n-k)!)\, \pi_1 \cdots \pi_k P^{n-k} \rangle$

7. differentiate an n-form k times, in the directions of the vectors
   $\pi_1$ through $\pi_k$
   summing:    $D_{\pi_1} \cdots D_{\pi_k} f = f \llcorner \pi_1 \cdots \pi_k$
   averaging:  $D_{\pi_1} \cdots D_{\pi_k} f = f \llcorner (n!/(n-k)!)\, \pi_1 \cdots \pi_k$

8. differentiate an n-form n times, in the directions of the vectors
   $\pi_1$ through $\pi_n$, thus producing a constant
   summing:    $D_{\pi_1} \cdots D_{\pi_n} f = \langle f, \pi_1 \cdots \pi_n \rangle$
   averaging:  $D_{\pi_1} \cdots D_{\pi_n} f = \langle f, n!\, \pi_1 \cdots \pi_n \rangle$

9. differentiate an n-form n times, each time in the direction of the
   vector $\pi$, thus producing a constant
   summing:    $(D_\pi)^n f = \langle f, \pi^n \rangle$
   averaging:  $(D_\pi)^n f = \langle f, n!\, \pi^n \rangle$

Table B.2: Basic formulas under the summed and averaged pairings

1. evaluate an n-form at a point
   summing:    $(E_P)^n f = \langle f, P^n/n! \rangle$
   averaging:  $(E_P)^n f = \langle f, P^n \rangle$

2. set one argument of the blossom of an n-form to a point
   summing:    $E_P f = f \llcorner P/n$
   averaging:  $E_P f = f \llcorner P$

3. differentiate an n-form n times, each time in the direction of the
   same vector
   summing:    $(D_\pi)^n f = \langle f, \pi^n \rangle$
   averaging:  $(D_\pi)^n f = \langle f, n!\, \pi^n \rangle$

4. differentiate an n-form once, in the direction of a vector
   summing:    $D_\pi f = f \llcorner \pi$
   averaging:  $D_\pi f = f \llcorner n\pi$



Comparing the first line in Table B.1 to the last, we are reminded that
evaluating an n-form is like differentiating n times, always in the same
direction, except for that factor of n!. Comparing the second line to the
next-to-last, we see that evaluating the blossom of an n-form is like
differentiating n times, in arbitrary directions, again except for the
factor of n!. But is there any evaluation-like operator that corresponds to
differentiating only once? We can define such an operator as follows, an
operator $E_P$ that does (1/n)th of the work of evaluating an n-form at P. In
blossoming terms, what $E_P$ does is to set one of the arguments of the
blossom to P; that is, we require that

\[ (E_P f)^{\sim}(Q_1, \ldots, Q_{n-1}) = \tilde{f}(Q_1, \ldots, Q_{n-1}, P). \]

We then have $(E_P)^n f = f(P)$. Using this new operator $E_P$, we can boil
down Table B.1 into the more basic Table B.2, above.
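
To see where the factorials in the first and third lines of Table B.1 come from, here is a small numerical sketch. It assumes, as a restatement of the definitions used in this monograph, that the summed pairing of a product of covectors with a product of vectors is the sum over all ways of matching them up (a permanent) and that the averaged pairing divides that sum by n!. The particular covectors, point, and direction are arbitrary test data, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod

def dot(x, p):
    """Apply the covector x to the vector p."""
    return sum(a * b for a, b in zip(x, p))

def summed(covectors, vectors):
    """Assumed summed pairing of x1...xn with p1...pn: the permanent of
    the matrix whose (i, j) entry is xi(pj)."""
    n = len(covectors)
    assert n == len(vectors)
    return sum(prod(dot(covectors[i], vectors[s[i]]) for i in range(n))
               for s in permutations(range(n)))

def averaged(covectors, vectors):
    """Assumed averaged pairing: the summed pairing divided by n!."""
    return Fraction(summed(covectors, vectors), factorial(len(covectors)))

# f = x1 * x2 * x3, a cubic form on a 2-dimensional linear space.
xs = [(1, 2), (0, 1), (3, -1)]
P, pi = (2, 1), (1, -1)
n = len(xs)

# Direct computation: the value of f at P, and the derivative of f in the
# direction pi at P by the ordinary product rule.
f_at_P = prod(dot(x, P) for x in xs)
D_pi_f_at_P = sum(dot(xs[i], pi)
                  * prod(dot(xs[j], P) for j in range(n) if j != i)
                  for i in range(n))

# Line 1 of Table B.1: summing divides by n!, averaging does not.
evaluation_summed = Fraction(summed(xs, [P] * n), factorial(n))
evaluation_averaged = averaged(xs, [P] * n)

# Line 3 of Table B.1: summing divides by (n-1)!, averaging multiplies by n.
site = [pi] + [P] * (n - 1)
derivative_summed = Fraction(summed(xs, site), factorial(n - 1))
derivative_averaged = n * averaged(xs, site)
```

All four pairing-based values agree with the direct computations; the two conventions differ only in which side of the formula has to carry the numeric factor.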

Table B.2 shows that the summed pairing makes differentiation simple, 
hence forcing evaluation to divide, while the averaged pairing makes evalua- 
tion simple, hence forcing differentiation to multiply. Since evaluation seems 
more basic than differentiation, perhaps it is better to average. 

On the contrary: There is a deep reason why summing is better; but it 
will take us a few paragraphs to develop the argument. The first step is 
to note that some of the numeric factors of n or n! in Table B.2 are more 
expensive than others, the expensive ones being the ones on lines two and 
four, the lines where the left-hand sides don't involve n. 

On lines one and three, we are evaluating an n-form completely or
differentiating it n times. That is, we are applying the nth-order operator
$(E_P)^n$ or $(D_\pi)^n$ to set all n factors of the n-site with which that
n-form gets paired. So we have to know n; we have a good reason for needing
to know it. If we have to multiply or divide by n! as well, that is only an
annoyance.

But lines two and four are a different story. It makes sense to apply the
operator $E_P$ or $D_\pi$ to an n-form f without knowing n; indeed, we can even
apply those operators to inhomogeneous forms. If we are forced to require
that f be an n-form solely because we have to multiply or divide by n, that
is worse than annoying; it could be crippling.

For an example of those crippling effects, consider the product rule for
differentiation: $D_\pi(fg) = D_\pi(f)\,g + f\,D_\pi(g)$. This rule is valid for
all forms f and g, independent of their degrees; in fact, it is valid even
for forms f and g that are inhomogeneous. Under the summed pairing, that rule
turns into an analogous rule for contracting a product on an anchor, also
valid for all forms f and g:

\[ fg \llcorner \pi = (f \llcorner \pi)\,g + f\,(g \llcorner \pi) \qquad \text{[under summing].} \tag{B.1-2} \]

But suppose that we choose averaging instead of summing, so that
$D_\pi(f) = f \llcorner n\pi$. Then, in order to state the rule for contracting
a product on an anchor, the two forms involved have to be homogeneous and we
have to know their degrees; if f is an n-form and g is an m-form, we have

\[ fg \llcorner (n+m)\pi = (f \llcorner n\pi)\,g + f\,(g \llcorner m\pi) \qquad \text{[under averaging with } \deg(f) = n \text{ and } \deg(g) = m\text{].} \]



The crippling thing here is replacing a single, general rule with a
two-parameter family of narrow sub-rules. It is the off-diagonal sub-rules
that are particularly crippled, in this case. On the diagonal m = n, where
both f and g are n-forms, we can divide through by m + n = 2n to get

\[ fg \llcorner \pi = \frac{(f \llcorner \pi)\,g + f\,(g \llcorner \pi)}{2} \qquad \text{[under averaging with } \deg(f) = \deg(g)\text{].} \]

Note that this rule differs from Rule B.1-2 as averaging differs from summing.

So the multiplication by n in the averaged formula
$D_\pi f = f \llcorner n\pi$ is crippling. What about the division by n in the
summed formula $E_P f = f \llcorner P/n$; is it also crippling? It has the
potential to be, since it forces us to know the degree of a form f before we
can apply the operator $E_P$ to f. If there were powerful identities in which
the operator $E_P$ was applied to inhomogeneous forms, summing would cripple
our ability to translate those identities into the paired-algebras
framework. But there aren't. That's why you've known about the operator
$D_\pi$ for decades, while you learned about $E_P$ only a minute ago: because
$D_\pi$ has the powerful identities, not $E_P$. For example, the product rule
for $E_P$ is a two-parameter family of sub-rules; when f is an n-form and g is
an m-form, we have

\[ E_P(fg) = \frac{n}{n+m}\,(E_P f)\,g + \frac{m}{n+m}\,f\,(E_P g). \]
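
For readers who want to see where those two weights come from, here is one way to derive the sub-rule, sketched under the summed pairing. It uses only line two of Table B.2, in the form $E_P f = (f \llcorner P)/n$ for an n-form f, together with Rule B.1-2 applied with the point P as the anchor; the product fg is an (n+m)-form.

```latex
\begin{align*}
E_P(fg) &= \frac{1}{n+m}\,\bigl(fg \llcorner P\bigr)
           && \text{line two of Table B.2, with } \deg(fg) = n+m \\
        &= \frac{1}{n+m}\,\bigl((f \llcorner P)\,g + f\,(g \llcorner P)\bigr)
           && \text{Rule B.1-2} \\
        &= \frac{1}{n+m}\,\bigl(n\,(E_P f)\,g + m\,f\,(E_P g)\bigr)
           && \text{line two of Table B.2, twice} \\
        &= \frac{n}{n+m}\,(E_P f)\,g + \frac{m}{n+m}\,f\,(E_P g).
\end{align*}
```

The degrees n and m enter at the first and third steps, which is exactly why the rule cannot be stated for inhomogeneous forms.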



We should tolerate being annoyed to avoid being crippled, so lines two and
four in Table B.2 matter more than lines one and three. Since the operator
$D_\pi$ satisfies more powerful identities than does $E_P$, line four matters
more than line two. To me, at least, this seems like a strong argument in
favor of choosing the pairing that makes line four simple: the summed pairing.

The choice of whether to sum or to average is an issue of taste on which 
reasonable people may differ. And averaging definitely seems more attractive 
initially — I started out by averaging myself. But I now believe that CAGD 
will be better served in the long run if we adopt the summed pairing. 

By the way, pure mathematicians often sum, rather than average, because
they want their theory to apply over fields of prime characteristic. Over a
field whose characteristic is a prime p, we have n! = 0 for all $n \ge p$, so
we can't divide by n!. That argument is irrelevant in CAGD, of course, where
the only fields of interest are the reals and perhaps the complexes. But it
certainly isn't a bad thing for us in CAGD to adopt the same convention as
the majority of pure mathematicians, since that will make it easier for us to
read their textbooks.
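
The obstruction in prime characteristic is easy to observe with modular arithmetic. The sketch below models the field with p elements by the integers modulo p; the choice p = 5 is arbitrary and only for illustration.

```python
from math import factorial

p = 5                                            # an arbitrary small prime
residues = [factorial(n) % p for n in range(2 * p)]

# The values of n for which n! is nonzero modulo p, hence invertible.
invertible = [n for n, r in enumerate(residues) if r != 0]

# Below p, division by n! is possible: here is the inverse of 4! modulo 5.
inverse_of_4_factorial = pow(factorial(4), -1, p)
```

Only n = 0 through p - 1 survive in the list of invertible cases, since p itself is a factor of n! from n = p onward.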

B.2 Other options for avoiding annoyance 

We have discussed the summed pairing and the averaged pairing as if they 
were our only possible options: a binary choice. There are other alternatives, 
of course, and several of them are worth mentioning — although not, in my 
opinion, worth adopting. 

One way to avoid the annoying factors would be to introduce both the summed
pairing and the averaged pairing, using different notations (perhaps
$\langle\,,\rangle$ and $[\,,]$). We could then use the summed pairing in
formulas for differentiation and the averaged pairing in formulas for
evaluation. But having two pairings around would be confusing. For example,
when describing one basis as being dual to another, we would have to specify
under which pairing. The cost in confusion would probably outweigh the
benefit in reduced annoyance.

Several other schemes for mitigating the annoying factors involve rescaled
exponentials, which incorporate a denominator of n! into the concept of an
nth power. I learned of this clever idea from Greub's text [26]; I don't know
who first came up with it. There are several ways to exploit this idea, so
that we can adopt the summed pairing and still end up with pretty evaluation
formulas. In one of those ways, the rescaling of the exponentials is explicit;
in the other, it is implicit.

With explicit rescaling, we would introduce a new notation, perhaps
$x^{n/!} := x^n/n!$, reading as "x to the n slash bang". Exploiting this new
notation, the formula for evaluating an n-form f under the summed pairing
would be $f(P) = \langle f, P^{n/!} \rangle$, which is slightly prettier than
$f(P) = \langle f, P^n/n! \rangle$. We could also write other famous formulas
more neatly using these rescaled exponentials. For example, we could expand
an analytic function $f : \mathbb{R} \to \mathbb{R}$ in a Taylor series around
b as

\[ f(x) = \sum_{n \ge 0} f^{(n)}(b)\,(x-b)^{n/!}. \]

Even better, we could write the Binomial Theorem as

\[ (x+y)^{n/!} = \sum_{0 \le k \le n} x^{k/!}\, y^{(n-k)/!}. \]

All three of the factorials that normally constitute the binomial coefficient
are hidden here in the three rescaled exponentials. But other formulas
are prettier using standard exponentials. For example, the rule
$x^i x^j = x^{i+j}$ for multiplying by adding exponents is prettier than the
alternative

\[ x^{i/!}\, x^{j/!} = \binom{i+j}{i}\, x^{(i+j)/!}. \]

So we would probably end up using both rescaled and standard exponentials.
I am afraid that the cost in confusion from having two different exponentials
would outweigh the small benefit in reduced annoyance.
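
Both the pretty and the ugly identity for rescaled exponentials can be checked with exact rational arithmetic. In this sketch the function name rescaled and the sample values are hypothetical; the function simply computes x to the n slash bang as defined above.

```python
from fractions import Fraction
from math import comb, factorial

def rescaled(x, n):
    """x to the n slash bang: x**n / n!, kept exact with Fractions."""
    return Fraction(x) ** n / factorial(n)

x, y, n = Fraction(3, 2), Fraction(-2), 6

# The Binomial Theorem with all three factorials hidden.
binomial_lhs = rescaled(x + y, n)
binomial_rhs = sum(rescaled(x, k) * rescaled(y, n - k) for k in range(n + 1))

# The product rule, where a binomial coefficient reappears.
i, j = 2, 3
product_lhs = rescaled(x, i) * rescaled(x, j)
product_rhs = comb(i + j, i) * rescaled(x, i + j)
```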

A more radical approach would implicitly rescale all exponentials of
anchors. Rather than introducing a new notation, we would define $p^n$, for an
anchor p over A, to mean

\[ p^n := \frac{\overbrace{p \cdot p \cdots p}^{n \text{ factors}}}{n!}, \]

leading to the summed-pairing evaluation formula
$f(p) = \langle f, p^n \rangle$. But that radical approach is a nonstarter in
our context, because it would destroy the symmetry between the algebras of
forms and sites. In building up a form out of coanchors, it is a universal,
well-established convention that the exponentials used are standard
exponentials. Symmetry requires that we use standard exponentials also in
building up a site out of anchors.

By the way, Greub adopts that radical approach in building his symmetric
algebras. But things are a bit different for Greub because he uses the special
symbol "$\vee$" to denote the multiplication in his symmetric algebra, rather
than a centered dot or simple juxtaposition. The products in his symmetric
algebras hence look less like standard products, so it is not so shocking that
his exponentials,

\[ x^n := \frac{\overbrace{x \vee x \vee \cdots \vee x}^{n \text{ factors}}}{n!}, \]

are rescaled, rather than standard. (Whether rescaling or not, some authors
would prefer to write $x^{\vee n}$ in this context, rather than $x^n$.)

C.1 Thoughts about functional notations

In this monograph, we follow the standard mathematical conventions for 
notating functions in three respects: 

1. When applying a function f to a datum x, we write f(x), with the
   function on the left; that is, we denote functions as prefix operators.
   The alternative would be some flavor of postfix operator, as in the
   expressions (x)f or xf.

2. We compose functions from right to left, so that applying $f \circ g$
   means applying first g and then f. The alternative, of course, would be a
   composition operator that worked from left to right. (If you ever need
   such an operator, consider using a semicolon, setting
   $g;f := f \circ g$. The analogy with computer programs almost requires
   that applying g;f means applying g first.)

3. When denoting the type of a function, we give its domain first and its
   codomain second, connected by a rightward-pointing "to" arrow; for
   example, we might declare f to be "a function from A to B" and write
   $f : A \to B$. The alternative would be to give the codomain first and
   the domain second, connected by a leftward-pointing "from" arrow. In
   this alternative scheme, we would declare that same function f from A
   to B as "a function to B from A" and write $f : B \leftarrow A$.

Unfortunately, the third of those standard conventions is inconsistent with
the first two. Prefix application and right-to-left composition interact better
with "from" arrows than with "to" arrows. For example, it was our use of
"to" arrows that made the declaration of the linear map
$h_{12} : X_2 \to X_1$ in Proposition 9.1-3 seem backward. We would have been
better off if we had declared the map $h_{12}$ with a "from" arrow, as
$h_{12} : X_1 \leftarrow X_2$.

Sadly, we mathematicians may be stuck with this inconsistency for a long
time. Switching to "from" arrows works fine in limited contexts; but do we
really want all of the arrows in our commutative diagrams to point from right
to left? And, probably also, for consistency, from bottom to top? Postfix
application and left-to-right composition also work fine in certain contexts.
But how do you read the postfix application (x)f in English? The prefix
application f(x) is "f of x" or "f at x". Perhaps (x)f could be "x into f" or
"x through f" or "x sent to f"? Also, suppose that we want to deemphasize
the argument x by setting it in smaller type. If f is a prefix operator, there
is no difficulty: The argument x becomes a subscript, as in $f_x$. Are we ready
for subscripts on the left, as in ${}_x f$? How would we read that expression?
What about double subscripts, as in ${}_{{}_n x} f$?

By the way, duality is one context where it might be helpful to adopt both 
of the sets of consistent conventions simultaneously, using one for primal 
functions and the other for their duals. For example, we could use prefix 
application, right-to-left composition, and "from" arrows on the primal side, 
but postfix application, left-to-right composition, and "to" arrows on the 
dual side. Thus, we might introduce the primal linear functions $f : X \leftarrow Y$
and $g : Y \leftarrow Z$. Their composition $f \circ g$ would then be applied as a prefix
operator to an element $z$ of $Z$, getting $f \circ g(z) = f(g(z))$ in $X$. The dual
functions would then be $f^* : X^* \to Y^*$ and $g^* : Y^* \to Z^*$. Their composition
$f^* ; g^*$ would be applied as a postfix operator to an element $x^*$ of $X^*$, getting
$(x^*)f^* ; g^* = ((x^*)f^*)g^*$ in $Z^*$. One advantage of this scheme is that the
primal operator $f$ and its dual $f^*$ are represented by the same matrix. To
apply $f$, we multiply that matrix by a column vector on the right; to apply
$f^*$, we multiply that same matrix by a row vector on the left. We would have
the identity $(f \circ g)^* = f^* ; g^*$, an analog of de Morgan's Law.
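
As a small numerical sketch of that scheme, the following Python fragment checks, on one example, that a single matrix can serve both a primal map (applied to columns on its right) and its dual (applied to rows on its left). The matrices F and G, the sample vectors, and the helper names are illustrative assumptions chosen here, not anything fixed by the text.

```python
# Illustrative sketch: one matrix serves both a primal map and its dual.
# F represents f : X <- Y and G represents g : Y <- Z (hypothetical values).

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def apply_prefix(m, col):
    """Primal side: the matrix m times a column vector on its right."""
    return [sum(m[i][k] * col[k] for k in range(len(col)))
            for i in range(len(m))]

def apply_postfix(row, m):
    """Dual side: a row vector on the left times the same matrix m."""
    return [sum(row[k] * m[k][j] for k in range(len(row)))
            for j in range(len(m[0]))]

F = [[1, 2, 0], [0, 1, 3]]        # f : X <- Y, with dim X = 2 and dim Y = 3
G = [[2, 1], [0, 1], [1, 0]]      # g : Y <- Z, with dim Z = 2
z = [1, 4]                        # an element of Z, as a column
x_star = [5, 7]                   # an element of X*, as a row
y = [1, 0, 2]                     # an element of Y, as a column

FG = matmul(F, G)                 # the matrix of f o g, and also of f*; g*

# Prefix: (f o g)(z) = f(g(z)).
assert apply_prefix(FG, z) == apply_prefix(F, apply_prefix(G, z))
# Postfix: (x*)(f*; g*) = ((x*)f*)g*.
assert apply_postfix(x_star, FG) == apply_postfix(apply_postfix(x_star, F), G)
# The same matrix gives matching results on both sides: <(x*)f*, y> = <x*, f(y)>.
lhs = sum(a * b for a, b in zip(apply_postfix(x_star, F), y))
rhs = sum(a * b for a, b in zip(x_star, apply_prefix(F, y)))
assert lhs == rhs
```

The single product FG is used for both the prefix and the postfix composition, which is the matrix form of the identity relating the composed dual to the duals composed in the opposite reading order.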

C.2 Paired algebras in wilder contexts 

The paired-algebras framework for CAGD exploits some mathematics that 
produces a pair of algebras (Sym(X), Sym(Y)) from a dual pair (X, Y) of
linear spaces. As we developed that theory in this monograph, we mentioned
various things that can go wrong in contexts wilder (that is, more general)
than ours. In this section, let's review that story, starting out with the
most restrictive assumptions, which lead to the prettiest theory. 

C.2.1 Fields of characteristic zero 

For the very prettiest theory, the spaces X and Y should be finite-dimensional 
linear spaces over the complex numbers, or, more generally, over any
algebraically closed field of characteristic zero. For example, over such a field,
every univariate n-ic splits into n linear factors, and a bivariate quadratic
splits into two linear factors just when its discriminant is zero. 

The real numbers are the field of scalars that is most relevant in CAGD. 
The reals have characteristic zero, but are not algebraically closed. That 
makes factoring forms and sites more complicated. But the rest of the theory 
is unaffected, as it would be over any field of characteristic zero. 

C.2.2 Fields of prime characteristic 

Suppose that we start with finite-dimensional linear spaces X and Y, but over 
a field whose characteristic is a prime p. For simplicity, let's assume for now 
that the scalar field is infinite; for example, it might be the algebraic closure of 
the field $\mathbf{Z}/(p)$ for some prime $p$ or the field $(\mathbf{Z}/(p))(t)$ of rational functions in
an indeterminate $t$, where the coefficients in the numerator and denominator
polynomials are taken from $\mathbf{Z}/(p)$. We can still build the symmetric algebras
$\mathrm{Sym}(X)$ and $\mathrm{Sym}(Y)$. And, for each $n$, there is still a unique scalar-valued
bilinear map on $\mathrm{Sym}^n(X) \times \mathrm{Sym}^n(Y)$ that satisfies the Summed Permanent
Identity. Unfortunately, that map is no longer a pairing, in general. Indeed,
as soon as $n$ is at least $p$, the Summed Permanent Identity implies that
$\langle x_1 \cdots x_n, y_1 \cdots y_n \rangle = 0$ when either $x_1 = \cdots = x_n$ or $y_1 = \cdots = y_n$, since
the resulting sum then has $n! = 0$ identical terms. The Averaged Permanent
Identity is even worse, since it tries to get rid of this factor of $n! = 0$ by
dividing by it. The linear spaces $\mathrm{Sym}^n(X)$ and $\mathrm{Sym}^n(Y)$ still have the same
dimension, so there still exist lots of pairing maps between them, but none
of those pairing maps interact with the multiplications in the two algebras
as specified by the Permanent Identity.
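
To see that vanishing concretely, here is a minimal Python sketch that evaluates the sum over permutations described above, reduced modulo $p = 3$. It assumes that the degree-1 pairing is the dot product of coordinate tuples and that the coordinates happen to lie in the prime subfield $\mathbf{Z}/(3)$; the function names and the sample vectors are hypothetical.

```python
from itertools import permutations

def summed_pairing(xs, ys, bracket, p):
    """Permanent of the matrix (<x_i, y_j>), reduced mod p.

    This is the sum over all n! permutations that the text attributes
    to the Summed Permanent Identity.
    """
    n = len(xs)
    total = 0
    for perm in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= bracket(xs[i], ys[perm[i]])
        total += term
    return total % p

def dot(x, y):
    """A hypothetical degree-1 pairing: the dot product of coordinate tuples."""
    return sum(a * b for a, b in zip(x, y))

p = 3
x = (1, 2)                          # a vector of X
ys = [(1, 0), (0, 1), (2, 1)]       # three arbitrary covectors in Y

# n = p = 3 with x_1 = x_2 = x_3: all 3! = 6 terms coincide, and 6 = 0 mod 3.
assert summed_pairing([x, x, x], ys, dot, p) == 0

# With n = 2 < p, the same construction need not vanish.
assert summed_pairing([x, x], ys[:2], dot, p) != 0
```

No choice of the three covectors can rescue the first case, since the six terms of the sum are equal whenever the three vectors are.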

It is a serious blow to the theory that the bilinear map defined by the 
Permanent Identity is not a pairing. Furthermore, that failure cannot be 
repaired by, for example, rearranging the numeric factors in some clever 
way; rather, the case of prime characteristic really is different. To see how, 
consider the cubing map $\sigma\colon \hat{A} \to \mathrm{Sym}^3(\hat{A})$ given by $\sigma(q) := q^3$ for any
anchor $q$ over $A$ and the evaluation-of-a-cubic map $e\colon \hat{A} \to \mathrm{Sym}^3(A^*)^*$ given
by $e(q)(f) = e_q(f) := f(q)$. Over a field of characteristic zero, the maps $\sigma$
and $e$ behave essentially the same; so, by choosing the proper pairing, we
can represent $\mathrm{Sym}^3(A^*)^*$ as $\mathrm{Sym}^3(\hat{A})$ in such a way that $e(q)$ is represented
by $\sigma(q)$, for all anchors $q$. (The pairing that does precisely that, of course,
is the averaged pairing; under the superior summed pairing, $e(q) = \sigma(q)/6$.)
Over a field of characteristic $p = 3$, however, the maps $\sigma$ and $e$ behave quite
differently. For example, when $A$ is an affine plane, we have

$$\sigma(wC + u\varphi + v\psi) = (wC + u\varphi + v\psi)^3 = w^3 C^3 + u^3 \varphi^3 + v^3 \psi^3,$$

with the remaining seven terms dropping out, due to their multinomial
coefficients of 3 or 6. Thus, the cube of any anchor over $A$ lies in the 3-dimensional
subspace of $\mathrm{Sym}^3(\hat{A})$ spanned by the 3-sites $C^3$, $\varphi^3$, and $\psi^3$. Over any infinite
field, however, even one of prime characteristic, the values of any polynomial
determine its coefficients uniquely. Hence, the linear functionals in the set
$e(\hat{A})$ span all 10 dimensions of the space $\mathrm{Sym}^3(A^*)^*$. So there is no hope
that some juggling of scale factors could allow us to represent $\mathrm{Sym}^3(A^*)^*$ as
$\mathrm{Sym}^3(\hat{A})$ in such a way that $e(q)$ is represented by $\sigma(q)$, for all $q$.
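
The count of surviving terms in the cube can be checked mechanically. The following Python sketch lists the exponent triples whose multinomial coefficient is nonzero modulo a prime; the function names are hypothetical, and the three variables stand for the coefficients $w$, $u$, and $v$ of an anchor over an affine plane.

```python
from math import factorial

def multinomial(a, b, c):
    """Coefficient of w^a u^b v^c in (w + u + v)^(a + b + c)."""
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))

def surviving_terms(n, p):
    """Exponent triples of degree n whose multinomial coefficient is nonzero mod p."""
    return [(a, b, n - a - b)
            for a in range(n + 1)
            for b in range(n + 1 - a)
            if multinomial(a, b, n - a - b) % p != 0]

cubic = surviving_terms(3, 3)
# Only the three pure cubes survive in characteristic 3.
assert sorted(cubic) == [(0, 0, 3), (0, 3, 0), (3, 0, 0)]

# There are 10 cubic monomials in three variables, so 7 of them drop out.
total = sum(1 for a in range(4) for b in range(4 - a))
assert total == 10 and total - len(cubic) == 7
```

Calling surviving_terms(9, 3) gives the analogous three pure ninth powers out of the 55 monomials of degree 9.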

By the way, over fields of prime characteristic, the term "Veronese" is
associated with evaluating an $n$-ic, rather than with raising to the $n$th power.
For example, in characteristic 3, the "Veronese surface of parametric degree
3" refers to the image of the evaluation map $e\colon \hat{A} \to \mathrm{Sym}^3(A^*)^*$, rather than
to the (degenerate) image of the cubing map $\sigma\colon \hat{A} \to \mathrm{Sym}^3(\hat{A})$.

C.2.3 Finite fields 

If our field of scalars is not only of prime characteristic but is actually
finite, then the values of a polynomial are no longer enough, in general, to
uniquely determine the coefficients of that polynomial. This means that the
evaluation-of-an-$n$-ic map $e$ can be degenerate, as well as the
raise-to-the-$n$th-power map $\sigma$.

For an example, let $A$ be an affine plane once again, but now over the field
$\mathbf{Z}/(3)$ of cardinality 3. So there are only 9 points in $A$ and only 27 anchors
over $A$. Consider the ninth-power map $\sigma\colon \hat{A} \to \mathrm{Sym}^9(\hat{A})$ and the
evaluation-of-a-nonic map $e\colon \hat{A} \to \mathrm{Sym}^9(A^*)^*$. The equation $\sigma(q) = q^9 = (q^3)^3$ shows
that $\sigma(wC + u\varphi + v\psi) = w^9 C^9 + u^9 \varphi^9 + v^9 \psi^9$, so the ninth powers of all anchors
over $A$ lie in a 3-dimensional subspace of $\mathrm{Sym}^9(\hat{A})$; thus, the map $\sigma$ is quite
degenerate. But the evaluation map $e$ must be somewhat degenerate also, just
by counting. Since there are only 27 anchors over $A$, the linear functionals
in the set $e(\hat{A})$ can't span more than 27 dimensions, while the full space
$\mathrm{Sym}^9(A^*)^*$ has dimension 55. (In fact, since $e(0) = 0$ and $e(-q) = -e(q)$, we
can tighten the bound on the dimension from 27 to $(27 - 1)/2 = 13$, and 13
turns out to be the exact answer.)
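
The claim that 13 is the exact answer can be checked by brute force. The following Python sketch assumes that an anchor is stored as a coordinate triple $(w, u, v)$ over $\mathbf{Z}/(3)$ and that a nonic is stored by its 55 monomial coefficients, so that evaluating at an anchor is a row of 55 monomial values. The helper names are hypothetical, and the Gaussian elimination is a generic routine, not a construction from the text.

```python
from itertools import product

P = 3   # the field Z/(3)
N = 9   # degree of the forms

# The 55 monomials w^a u^b v^c with a + b + c = 9.
exponents = [(a, b, N - a - b) for a in range(N + 1) for b in range(N + 1 - a)]

def evaluation_row(q):
    """Values of all 55 monomials at the anchor q = (w, u, v), mod 3."""
    w, u, v = q
    return [(pow(w, a, P) * pow(u, b, P) * pow(v, c, P)) % P
            for a, b, c in exponents]

def rank_mod_p(rows, p):
    """Rank of a matrix over Z/(p), by Gaussian elimination."""
    rows = [r[:] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)       # modular inverse (Python 3.8+)
        rows[rank] = [(x * inv) % p for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                factor = rows[i][col]
                rows[i] = [(x - factor * y) % p for x, y in zip(rows[i], rows[rank])]
        rank += 1
    return rank

anchors = list(product(range(P), repeat=3))     # all 27 anchors
matrix = [evaluation_row(q) for q in anchors]   # 27 rows, 55 columns

assert len(anchors) == 27 and len(exponents) == 55
assert rank_mod_p(matrix, P) == 13
```

The rank of that 27-by-55 matrix is the dimension of the span of the evaluation functionals, which is what the counting argument bounds.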

This degeneracy of e torpedoes one of the three concrete constructions 
of the symmetric algebra Sym(X) that we discussed in Section 5.1, the one 
that exploits duality. Over a finite field, we have to construct the algebra 
Sym(X) by dealing in some way with polynomials whose variables lie in X. 
We can't exploit duality to replace each such polynomial by the scalar-valued 
function on Y = X* that it defines, since two distinct such polynomials may 
define the same function. 

On the other hand, some things get nicer over a finite field. A subspace, 
such as a line or a plane, has only finitely many points in it. So we can 
associate, with such a subspace, a polynomial that has all of the points in 
that subspace as its roots; see Macdonald [39]. 

C.2.4 Linear spaces of infinite dimension

If $X$ is a linear space of infinite dimension, then its dual space $Y := X^*$ is
vastly bigger than $X$, so the concept of a pairing breaks down already in
degree 1. We can still build the algebra $\mathrm{Sym}(X)$, and it is still an algebra
of polynomials, albeit polynomials in an infinite number of variables. And
we can build the algebra $\mathrm{Sym}(Y)$ as well. But $\mathrm{Sym}(Y)$ is so much bigger
than $\mathrm{Sym}(X)$ that those two algebras cannot be paired (except in degree 0,
where $\mathrm{Sym}^0(X) = \mathrm{Sym}^0(Y) = \mathbf{R}$).

For those who are curious, here is what I mean when I say that the dual
space $X^*$ is "vastly bigger" than $X$. Let $F$ be a field and let $X$ be a linear
space over $F$ whose dimension $\kappa := \dim_F(X)$ is infinite; so $\kappa$ is an infinite
cardinal. It then turns out [6] that $\dim_F(X^*) = |X^*| = |F|^\kappa \ge 2^\kappa > \kappa$.

C.2.5 Modules over commutative rings 

Things get still wilder when we generalize from linear spaces over a field to 
modules over a commutative ring R. Here are a few of the problems that 
can arise in that context. 

First, the additive group of the ring R may have torsion. If so, we end up 
with all of the problems associated with fields of prime characteristic, only 
worse. Different elements of R may have different additive orders, and those 
orders need not be prime. 

Second, as we mentioned in Section 9.3, only the nicest $R$-modules $M$,
the free modules, have any bases at all. So a construction of the symmetric 
algebra Sym(M) that starts by choosing a basis may not be applicable. The 
field of rational numbers Q, viewed as a module over the integers Z, provides 
a simple example of a torsion-free module that is not free. 

Bourbaki [4, 5] is a good source for the theory of the symmetric algebra 
at this level of generality and subtlety. 

C.2.6 The division ring of quaternions 

What about a ring $R$ that isn't commutative? We can still define the concept
of an $R$-module, but we must now specify whether scalars multiply from the
left or from the right. Let's think about left $R$-modules, where the scalar
multiplication takes $r$ in $R$ and $m$ in $M$ to $rm$ in $M$. Things are now wilder
yet; for example, a free left $R$-module can have bases of different cardinalities.

Exercise C.2-1 Consider real (or integral; it doesn't matter) matrices
$m := (m_{ij})_{i,j \ge 0}$ with infinitely many rows and columns, each of which is
eventually identically zero. The product of two such matrices is another such,
so the set of all such matrices forms a noncommutative ring $R$. Consider the
ring $R$ itself as a left $R$-module. The module $R$ is free, since the identity
matrix $i$ forms a basis of cardinality one; for any $m$ in $R$, the equation
$m = ri$ has the unique solution $r = m$. Show that the following matrices $j$
and $k$ form a basis of cardinality two:

$$
j := \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 1 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 1 & 0 & \cdots \\
\vdots & & & & & & \ddots
\end{pmatrix},
\qquad
k := \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 & \cdots \\
0 & 0 & 0 & 1 & 0 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & 1 & \cdots \\
\vdots & & & & & & \ddots
\end{pmatrix}.
$$

That is, for any $m$ in $R$, show that there exist unique matrices $r$ and $s$ in $R$
with $m = rj + sk$.
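
For readers who would like to experiment before writing a proof, here is a minimal Python sketch of the decomposition on matrices with finitely many nonzero entries. It assumes that $j$ has its ones at positions $(b, 2b)$ and $k$ at positions $(b, 2b+1)$, with rows and columns numbered from zero. The dictionary representation and all function names are hypothetical; general elements of $R$, such as the identity, have infinitely many nonzero entries and are not covered by this finite model.

```python
# Sparse matrices with finitely many nonzero entries: {(row, col): value}.

def times_j(r):
    """r j: since j has a 1 at (b, 2b), column b of r moves to column 2b."""
    return {(a, 2 * b): val for (a, b), val in r.items()}

def times_k(s):
    """s k: since k has a 1 at (b, 2b + 1), column b of s moves to column 2b + 1."""
    return {(a, 2 * b + 1): val for (a, b), val in s.items()}

def add(m1, m2):
    """Entrywise sum, dropping zero entries."""
    out = dict(m1)
    for key, val in m2.items():
        out[key] = out.get(key, 0) + val
    return {key: val for key, val in out.items() if val != 0}

def split(m):
    """Candidate (r, s) with m = r j + s k: de-interleave the columns of m."""
    r = {(a, c // 2): val for (a, c), val in m.items() if c % 2 == 0}
    s = {(a, c // 2): val for (a, c), val in m.items() if c % 2 == 1}
    return r, s

m = {(0, 0): 3, (0, 1): -1, (2, 4): 7, (5, 3): 2}
r, s = split(m)
assert r == {(0, 0): 3, (2, 2): 7}
assert s == {(0, 0): -1, (5, 1): 2}
assert add(times_j(r), times_k(s)) == m
```

The sketch only exhibits one pair $(r, s)$ for one sample $m$; the exercise still asks for an argument that such a pair always exists in $R$ and is unique.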

Left modules over division rings are much better behaved; indeed, they
are almost as well behaved as linear spaces over fields. A division ring
(a.k.a. skew field) is a nonzero† ring in which every nonzero element is
invertible. If $R$ is a division ring, then every left $R$-module $M$ is free and all
bases for $M$ have the same cardinality.

The division ring that is most likely to be of interest in CAGD is the 
quaternions H. It would be interesting to study how much of the theory of 
the paired algebras survives when working with left H-modules. One bad 
sign is that only a pale shadow of the theory of determinants carries over to 
matrices of quaternions. While we would naively hope that the determinant 
of a matrix $m$ of quaternions was itself a quaternion, say $h$, the best that can
be done (see Artin [2]) is to define $\det(m)$ to be a nonnegative real number
that captures, roughly speaking, the norm $|h|$. The theory of permanents of
matrices of quaternions is likely to be a shadow that is similarly pale, and 
that would be bad news for the paired algebras. 
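
One elementary obstacle can be seen already for 2-by-2 matrices: since quaternions don't commute, the familiar formula $ad - bc$ depends on the order in which we write each product. The following Python sketch illustrates only that ambiguity, on one hypothetical matrix; it is not the construction in Artin [2], and the tuple representation and function names are assumptions of the sketch.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qsub(p, q):
    """Componentwise difference of two quaternions."""
    return tuple(x - y for x, y in zip(p, q))

ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# The basic noncommutativity: ij = k but ji = -k.
assert qmul(I, J) == K and qmul(J, I) == (0, 0, 0, -1)

# The sample matrix ((a, b), (c, d)) = ((1, i), (j, k)).
a, b, c, d = ONE, I, J, K
det_bc = qsub(qmul(a, d), qmul(b, c))   # ad - bc
det_cb = qsub(qmul(a, d), qmul(c, b))   # ad - cb
assert det_bc == (0, 0, 0, 0)
assert det_cb == (0, 0, 0, 2)
```

The two candidate formulas give $0$ and $2k$ for the same matrix, so neither can be adopted without further justification.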



† There is a unique ring {0}, called the zero ring, in which 1 = 0; we don't want the
zero ring to qualify as a division ring.